We consider the problem of nonparametric quantile regression for twice censored data.
Two new estimates are presented, which are constructed by applying concepts of monotone
rearrangements to estimates of the conditional distribution function. The proposed methods
avoid the problem of crossing quantile curves. Weak uniform consistency and weak convergence
are established for both estimates and their finite sample properties are investigated by
means of a simulation study. As a by-product, we obtain a new result regarding the weak
convergence of the Beran estimator for right censored data on the maximal possible domain.
1 Introduction

Quantile regression offers great flexibility in assessing covariate effects on event times. The method
was introduced by Koenker and Bassett (1978) as a supplement to least squares methods focussing
on the estimation of the conditional mean function and since this seminal work it has found
numerous applications in different fields [see Koenker (2005)]. Recently Koenker and Geling
(2001) have proposed quantile regression techniques as an alternative to the classical Cox model
for analyzing survival times. These authors argued that quantile regression methods offer an

* Supported by the Sonderforschungsbereich "Statistical modeling of nonlinear dynamic processes" (SFB 823) 
of the Deutsche Forschungsgemeinschaft. 

interesting alternative, in particular if there is heteroscedasticity in the data or inhomogeneity
in the population, which is a common phenomenon in survival analysis [see Portnoy (2003)].

Unfortunately the "classical" quantile regression techniques cannot be directly extended to survival
analysis, because for the estimation of a quantile one has to estimate the censoring distribution
for each observation. As a consequence rather stringent assumptions are required in censored
regression settings. Early work by Powell (1984, 1986) requires that the censoring times are
always observed. Moreover, even under this rather restrictive and, in many cases, unrealistic
assumption the objective function is not convex, which results in some computational problems [see
for example Fitzenberger (1997)]. Even worse, recent research indicates that using the information
contained in the observed censored data actually reduces the estimation accuracy [see Koenker
(2008)].


Because in most survival settings the information regarding the censoring times is incomplete,
several authors have tried to address this problem by making restrictive assumptions on the
censoring mechanism. For example, Ying et al. (1995) assumed that the responses and censoring
times are independent, which is stronger than the usual assumption of conditional independence.
Yang (1999) proposed a method for median regression under the assumption of i.i.d. errors, which
is computationally difficult to evaluate and cannot be directly generalized to the heteroscedastic
case. Recently, Portnoy (2003) suggested a recursively re-weighted quantile regression estimate
under the assumption that the censoring times and responses are independent conditionally on the
predictor. This estimate adopts the principle of self consistency for the Kaplan-Meier statistic [see
Efron (1967)] and can be considered as a direct generalization of this classical estimate in survival
analysis. Peng and Huang (2008) pointed out that the large sample properties of this recursively
defined estimate are still not completely understood and proposed an alternative approach, which
is based on martingale estimating equations. In particular, they proved consistency and asymptotic
normality of their estimate.

While all of the cited literature considers the classical linear quantile regression model with right
censoring, fewer results are available for quantile regression in a nonparametric context. Some
results on nonparametric quantile regression when no censoring is present can be found in
Chaudhuri (1991) and Yu and Jones (1997, 1998). Chernozhukov et al. (2006) and Dette and Volgushev
(2008) pointed out that many of the commonly proposed parametric or nonparametric estimates
lead to possibly crossing quantile curves and modified some of these estimates to avoid this
problem. Results regarding the estimation of the conditional distribution function from right censored
data can be found in Dabrowska (1987, 1989) or Li and Doss (1995). The estimation of conditional
quantile functions in the same setting is briefly discussed in Dabrowska (1987) and further
elaborated in Dabrowska (1992a), while El Ghouch and Van Keilegom (2008) proposed a quantile
regression procedure for right censored and dependent data. On the other hand, the problem of
nonparametric quantile regression for censored data where the observations can be censored from
either left or right does not seem to have been considered in the literature.

This gap can partially be explained by the difficulties arising in the estimation of the conditional
distribution function with two-sided censored data. The problem of estimating the (unconditional)
distribution function for data that may be censored from above and below has been considered by
several authors. For an early reference see Turnbull (1974). More recent references are Chang and
Yang (1987), Chang (1990), Gu and Zhang (1993) and Patilea and Rolin (2006). On the other
hand, to the best of their knowledge, the authors are not aware of literature on nonparametric conditional
quantile regression, or estimation of a conditional distribution function, for left and right censored
data when the censoring is not always observed and only the conditional independence of censoring
and lifetime variables is assumed.

In the present paper we consider the problem of nonparametric quantile regression for twice
censored data. We consider a censoring mechanism introduced by Patilea and Rolin (2006) and
propose an estimate of the conditional distribution function in several steps. On the basis of this
estimate and the preliminary statistics which are used for its definition, we construct two quantile
regression estimates using the concept of simultaneous inversion and isotonization [see Dette et al.
(2005)] and monotone rearrangements [see Dette et al. (2006), Chernozhukov et al. (2006) or
Anevski and Fougeres (2007) among others]. In Section 2 we introduce the model and the two
estimates, while Section 3 contains our main results. In particular, we prove uniform consistency
and weak convergence of the estimates of the conditional distribution function and its quantile
function. As a by-product we obtain a new result on the weak convergence of the Beran estimator
on the maximal possible interval, which is of independent interest. In Section 4 we illustrate the
finite sample properties of the proposed estimates by means of a simulation study. Finally, all
proofs and technical details are deferred to an Appendix.



2 Model and estimates 



We consider independent identically distributed random vectors $(T_i, L_i, R_i, X_i)$, $i = 1, \dots, n$, where
$T_i$ are the variables of interest, $L_i$ and $R_i$ are left and right censoring variables, respectively, and
the $\mathbb{R}^d$-valued random variables $X_i$ denote the covariates. We assume that the distributions of
the random variables $L_i$, $R_i$ and $T_i$ depend on $X_i$ and denote by $F_L(t|x) := P(L \le t \mid X = x)$ the
conditional distribution function of $L$ given $X = x$. The conditional distribution functions $F_R(\cdot|x)$
and $F_T(\cdot|x)$ are defined analogously.

Additionally, we assume that the random variables $T_i, L_i, R_i$ are almost surely nonnegative and
independent conditionally on the covariate $X_i$. Our aim is to estimate the conditional quantile
function $F_T^{-1}(\cdot|x)$. However, due to the censoring, we can only observe the triples $(Y_i, X_i, \delta_i)$ where
$Y_i = \max(\min(T_i, R_i), L_i)$ and the indicator variables $\delta_i$ are defined by

(2.1)  $\delta_i := \begin{cases} 0, & L_i < T_i \le R_i \\ 1, & L_i < R_i < T_i \\ 2, & T_i \le L_i < R_i \ \text{or} \ R_i \le L_i. \end{cases}$
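
The observation scheme in (2.1) can be illustrated by a short simulation helper. The following is only a minimal sketch in Python with NumPy; the function name `twice_censor` is a hypothetical choice made for illustration and does not come from the paper.

```python
import numpy as np

def twice_censor(T, L, R):
    """Return (Y, delta) for the mechanism Y = max(min(T, R), L).

    delta = 0: T is observed       (L < T <= R)
    delta = 1: right censored by R (L < R < T)
    delta = 2: left censored by L  (min(T, R) <= L)
    """
    T, L, R = (np.asarray(a, dtype=float) for a in (T, L, R))
    S = np.minimum(T, R)          # S = min(T, R)
    Y = np.maximum(S, L)          # observed response
    # min(T, R) <= L is equivalent to "T <= L < R or R <= L"
    delta = np.where(S <= L, 2, np.where(T <= R, 0, 1))
    return Y, delta
```

For instance, `twice_censor([2, 5, 1, 4], [1, 1, 3, 6], [3, 2, 5, 5])` yields one uncensored, one right censored and two left censored observations.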



Remark 2.1 An unconditional version of this censoring mechanism was introduced by Patilea
and Rolin (2006). Examples of situations where this kind of data occurs can be found, for example,
in Chapter 15 of Meeker and Escobar (1998). This model is also closely related to the
double censoring model, see Turnbull (1974) for the case without covariates. In that setting, the
assumption of independence between the random variables $L, R, T$ is replaced by the assumption
that $T$ is independent of the pair $(R, L)$ and additionally $P(L < R) = 1$. Note that neither of the
two assumptions is strictly more or less restrictive than the other. Rather the two models describe
different situations. Moreover, since $L, T, R$ are never observed simultaneously, it is not possible
to decide from the data which of the models is most appropriate. Instead, an understanding of the underlying
data generation process is crucial to identify the right model. A more detailed comparison of the
two models can be found in Patilea and Rolin (2001) and Patilea and Rolin (2006) for the case
without covariates.



Roughly speaking, the construction of an estimate for the conditional quantile function of $T$ can
be accomplished in three steps. First, we define the variables $S_i := \min(T_i, R_i)$ and consider the
model $Y_i = \max(S_i, L_i)$, which is a classical right censoring model. In this model we estimate the
conditional distribution $F_L(\cdot|x)$ of $L$. In a second step, we use this information to reconstruct the
conditional distribution of $T$ [see Section 2.1]. Finally, the concept of simultaneous isotonization
and inversion [see Dette et al. (2005)] and the monotone rearrangements, which were recently
introduced by Dette et al. (2006) in the context of monotone estimation of a regression function,
are used to obtain two estimates of the conditional quantile function [see Section 2.2].



2.1 Estimation of the conditional distribution function 



To be more precise, let $H$ denote the conditional distribution of $Y$. We introduce the notation
$H_k(A|x) = P(\{Y \in A\} \cap \{\delta = k\} \mid X = x)$ and obtain the decomposition $H = H_0 + H_1 + H_2$ for the
conditional distribution of $Y$. The sub-distribution functions $H_k$ ($k = 0, 1, 2$) can be represented
as follows

(2.2)  $H_0(dt|x) = F_L(t-|x)\,(1 - F_R(t-|x))\,F_T(dt|x)$

(2.3)  $H_1(dt|x) = F_L(t-|x)\,(1 - F_T(t|x))\,F_R(dt|x)$

(2.4)  $H_2(dt|x) = \{1 - (1 - F_T(t|x))(1 - F_R(t|x))\}\,F_L(dt|x) = F_S(t|x)\,F_L(dt|x).$

Note that the conditional (sub-)distribution functions $H_k$ and $H$ can easily be estimated from the
observed data by

(2.5)  $H_{k,n}(t|x) := \sum_{i=1}^n W_i(x)\, I\{Y_i \le t,\ \delta_i = k\}, \qquad H_n(t|x) := \sum_{i=1}^n W_i(x)\, I\{Y_i \le t\},$

where the quantities $W_i(x)$ denote local weights depending on the covariates $X_1, \dots, X_n$, which will
be specified below. We will use the representations (2.2) - (2.4) to obtain an expression for $F_T$ in
terms of the functions $H$, $H_k$ and then replace the distribution functions $H$, $H_k$ by their empirical
counterparts $H_n$, $H_{k,n}$, respectively. We begin with the reconstruction of $F_L$. First note that

(2.6)  $M_2(dt|x) := \frac{H_2(dt|x)}{H(t|x)} = \frac{F_S(t|x)\,F_L(dt|x)}{F_L(t|x)\,F_S(t|x)} = \frac{F_L(dt|x)}{F_L(t|x)}$

is the predictable reverse hazard measure corresponding to $F_L$ and hence we can reconstruct $F_L$
using the product-limit representation

(2.7)  $F_L(t|x) = \prod_{(t,\infty]} \big(1 - M_2(ds|x)\big)$



[see e.g. Patilea and Rolin (2006)]. Now having a representation for the conditional distribution
function $F_L$ we can define in a second step

(2.8)  $\Lambda_T(dt|x) := \frac{H_0(dt|x)}{F_L(t-|x) - H(t-|x)} = \frac{H_0(dt|x)}{F_L(t-|x)\,(1 - F_S(t-|x))} = \frac{H_0(dt|x)}{F_L(t-|x)\,(1 - F_R(t-|x))\,(1 - F_T(t-|x))}$

       $= \frac{F_L(t-|x)\,(1 - F_R(t-|x))\,F_T(dt|x)}{F_L(t-|x)\,(1 - F_R(t-|x))\,(1 - F_T(t-|x))} = \frac{F_T(dt|x)}{1 - F_T(t-|x)},$

which yields an expression for the predictable hazard measure of $F_T$. Finally, $F_T$ can be
reconstructed by using the product-limit representation

(2.9)  $1 - F_T(t|x) = \prod_{[0,t]} \big(1 - \Lambda_T(ds|x)\big)$



[see e.g. Gill and Johansen (1990)]. Note that formula (2.9) yields an explicit representation of the
conditional distribution function $F_T(\cdot|x)$ in terms of the quantities $H_0, H_1, H_2, H$, which can be
estimated from the data [see equation (2.5)]. The estimate of the conditional distribution function
is now defined as follows. First, we use the representation (2.7) to obtain an estimate of $F_L(\cdot|x)$,
that is

(2.10)  $F_{L,n}(t|x) = \prod_{(t,\infty]} \big(1 - M_{2,n}(ds|x)\big),$

where

(2.11)  $M_{2,n}(ds|x) := \frac{H_{2,n}(ds|x)}{H_n(s|x)}.$

Second, after observing (2.8) and (2.9), we define

(2.12)  $F_{T,n}(t|x) = 1 - \prod_{[0,t]} \big(1 - \Lambda_{T,n}(ds|x)\big),$

where

(2.13)  $\Lambda_{T,n}(ds|x) := \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)}.$



In Section 3 we will analyse the asymptotic properties of these estimates, while in the following 
Section 2.2 these estimates are used to construct nonparametric and noncrossing quantile curve 
estimates. 

Remark 2.2 Throughout this paper, we will adopt the convention '0/0 = 0'. This means that if,
for example, $H_{0,n}(dt|x) = 0$ and $F_{L,n}(t-|x) - H_n(t-|x) = 0$, the contribution of

$\frac{H_{0,n}(dt|x)}{F_{L,n}(t-|x) - H_n(t-|x)}$

in (2.13) will be interpreted as zero.
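
The two product-limit steps (2.10) - (2.13), together with the convention of Remark 2.2, can be sketched numerically as follows. This is only an illustrative Python/NumPy sketch under additional assumptions that are not part of the definitions above: the weights $W_i(x)$ are nonnegative, there are no ties among the $Y_i$, and the estimates are evaluated only at the ordered observations. The function name `product_limit_estimates` is hypothetical.

```python
import numpy as np

def product_limit_estimates(Y, delta, W):
    """Evaluate F_{L,n}(.|x) and F_{T,n}(.|x) at the ordered observations.

    Assumes nonnegative weights W_i(x) and no ties among the Y_i.
    Returns (sorted Y, F_L at sorted Y, F_T at sorted Y).
    """
    order = np.argsort(Y, kind="stable")
    Y = np.asarray(Y, dtype=float)[order]
    delta = np.asarray(delta)[order]
    W = np.asarray(W, dtype=float)[order]

    H = np.cumsum(W)                           # H_n(Y_(k)|x)
    H_left = np.concatenate(([0.0], H[:-1]))   # H_n(Y_(k)-|x)
    dH2 = W * (delta == 2)                     # jumps of H_{2,n}
    dH0 = W * (delta == 0)                     # jumps of H_{0,n}

    # (2.11): reverse hazard jumps, with the convention 0/0 = 0
    dM2 = np.divide(dH2, H, out=np.zeros_like(W), where=H > 0)
    # (2.10): product over (t, inf] and over [t, inf], computed from the right
    tail = np.cumprod((1.0 - dM2)[::-1])[::-1]   # F_{L,n}(Y_(k)-|x)
    F_L = np.concatenate((tail[1:], [1.0]))      # F_{L,n}(Y_(k)|x)

    # (2.13): hazard jumps, again with 0/0 = 0
    denom = tail - H_left
    dLam = np.divide(dH0, denom, out=np.zeros_like(W), where=denom > 0)
    # (2.12): product over [0, t]
    F_T = 1.0 - np.cumprod(1.0 - dLam)
    return Y, F_L, F_T
```

With equal weights $W_i(x) = 1/n$ and no left censoring this sketch reduces to the Kaplan-Meier estimate, which gives a simple sanity check.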



2.2 Non-crossing quantile estimates by monotone rearrangements 

In practice, nonparametric estimators of a conditional distribution function $F(\cdot|x)$ are not
necessarily increasing for finite sample sizes [see e.g. Yu and Jones (1998)]. Although this problem often
vanishes asymptotically, it still is of great practical relevance, because in a concrete application it
is not completely obvious how to invert a function that is not increasing. Trying to naively invert such
estimators may lead to the well-known problem of quantile crossing [see Koenker (2005) or Yu
and Jones (1998)], which poses some difficulties in the interpretation of the results. In this paper
we will discuss the following two possibilities to deal with this problem.



1. Use a procedure developed by Dette and Volgushev (2008) which is based on a simultaneous
isotonization and inversion of a not necessarily increasing distribution function. As a by-product this
method yields non-crossing quantile estimates. To be precise, we consider the operator

(2.14)  $\Psi : \begin{cases} L^\infty(J) \to L^\infty(\mathbb{R}) \\ f \mapsto \big( y \mapsto \int_J I\{f(u) \le y\}\,du \big) \end{cases}$

where $L^\infty(I)$ denotes the set of bounded, measurable functions on the set $I$ and $J$ denotes
a bounded interval. Note that for a strictly increasing function $f$ this operator yields the
right continuous inverse of $f$, that is $\Psi(f) = f^{-1}$ [here and in what follows, $f^{-1}$ will denote
the generalized inverse, i.e. $f^{-1}(t) := \sup\{s : f(s) \le t\}$]. On the other hand, $\Psi(f)$ is always
isotone, even in the case where $f$ does not have this property. Consequently, if $\hat f$ is a not
necessarily isotone estimate of an isotone function $f$, the function $\Psi(\hat f)$ could be regarded as
an isotone estimate of the function $f^{-1}$. Therefore, the first idea to construct an estimate of
the conditional quantile function consists in the application of the operator $\Psi$ to the estimate
$F_{T,n}$ defined in (2.12), i.e.

(2.15)  $\hat q(\tau|x) = \Psi(F_{T,n}(\cdot|x))(\tau).$

However, note that formally the mapping $\Psi$ operates on functions defined on bounded
intervals. More care is necessary if the operator has to be applied to a function with an
unbounded support. A detailed discussion and a solution of this problem can be found
in Dette and Volgushev (2008). In the present paper we use a different approach, which is
a slightly modified version of the ideas from Anevski and Fougeres (2007). To be precise,
note that estimators of the conditional distribution function $F(\cdot|x)$ [in particular those of
the form (2.5), which will be used later] often are constant outside of the compact interval
$J := [j_1, j_2] = [\min_i Y_i, \max_i Y_i]$. Now the structure of the estimator $F_{T,n}(\cdot|x)$ implies that
$F_{T,n}(\cdot|x)$ will also be constant outside of $J$. We thus propose to consider the modified
operator $\Psi_J$ defined as

(2.16)  $\Psi_J : f \mapsto \big( y \mapsto j_1 + \int_J I\{f(u) \le y\}\,du \big).$

Consequently the first estimator of the conditional quantile function is given by

(2.17)  $\hat q(\tau|x) = \Psi_J(F_{T,n}(\cdot|x))(\tau).$



2. Use the concept of increasing rearrangements [see Dette et al. (2006) and Chernozhukov
et al. (2006) for details] to construct an increasing estimate of the conditional distribution
function, which is then inverted in a second step. More precisely, we define the operator

(2.18)  $\Phi : \begin{cases} L^\infty(J) \to L^\infty(\mathbb{R}) \\ f \mapsto \Psi(f)^{-1} \end{cases}$

where $\Psi$ is introduced in (2.14). Note that for a strictly increasing right continuous function
$f$ this operator reproduces $f$, i.e. $\Phi(f) = f$. On the other hand, if $f$ is not isotone, $\Phi(f)$ is
an isotone function and the operator preserves the $L^p$-norm, i.e.

$\int_J |\Phi(f)(u)|^p\,du = \int_J |f(u)|^p\,du.$

Moreover, the operator also defines a contraction, i.e.

$\int_J |\Phi(f_1)(u) - \Phi(f_2)(u)|^p\,du \le \int_J |f_1(u) - f_2(u)|^p\,du \quad \forall\, p \ge 1$

[see Hardy et al. (1988) or Lorentz (1953)]. This means that if $\hat f\,(= f_1)$ is a not necessarily isotone
estimate of the isotone function $f\,(= f_2)$, then the isotonized estimate $\Phi(\hat f)$ is a better
approximation of the isotone function $f$ than the original estimate $\hat f$ with respect to any
$L^p$-norm [note that $\Phi(f) = f$ because $f$ is assumed to be isotone]. For a general discussion
of monotone rearrangements and the operators (2.14) and (2.18) we refer to Bennett and
Sharpley (1988), while some statistical applications can be found in Dette et al. (2006) and
Chernozhukov et al. (2006).



The idea is now to use rearranged estimators of $H_i(\cdot|x)$ and $H(\cdot|x)$ in the representations
(2.6)-(2.9). For this purpose we need to modify the operator $\Phi$ so that it can be applied to
functions of unbounded support. We propose to proceed as follows.

• Define the operator $\Phi_J$ indexed by the compact interval $J = [j_1, j_2]$ as

(2.19)  $\Phi_J : \begin{cases} L^\infty(\mathbb{R}) \to L^\infty(\mathbb{R}) \\ f \mapsto \big( y \mapsto I\{y < j_1\}\,f(j_1-) + (\Psi_J(f))^{-1}(y)\,I\{j_1 \le y \le j_2\} + I\{y > j_2\}\,f(j_2) \big) \end{cases}$

• Truncate the estimator $H_n(\cdot|x)$ for values outside of the interval $[0, 1]$, i.e.

$\tilde H_n(t|x) := H_n(t|x)\,I\{H_n(t|x) \in [0,1]\} + I\{H_n(t|x) > 1\}$

[note that in general estimators of the form (2.5) do not necessarily have values in the
interval $[0, 1]$ since the weights $W_i(x)$ might be negative].

• Use the statistic $H_n^{IP}(t|x) := \Phi_J(\tilde H_n(\cdot|x))(t)$ as estimator for $H(t|x)$.

• Observe that the estimator $H_n^{IP}(t|x)$ is by construction an increasing step function
which can only jump in the points $t = Y_i$, i.e. it admits the representation

(2.20)  $H_n^{IP}(t|x) = \sum_i W_i^{IP}(x)\,I\{Y_i \le t\}$

with weights $W_i^{IP}(x) \ge 0$. Based on this statistic, we define estimators $H_{k,n}^{IP}$ of the
subdistribution functions $H_k$ as follows

(2.21)  $H_{k,n}^{IP}(t|x) = \sum_i W_i^{IP}(x)\,I\{Y_i \le t\}\,I\{\delta_i = k\}, \qquad k = 0, 1, 2.$

In particular, such a definition ensures that $H_n^{IP}(t|x) = H_{0,n}^{IP}(t|x) + H_{1,n}^{IP}(t|x) + H_{2,n}^{IP}(t|x)$.
So far we have obtained increasing estimators of the quantities $H$ and $H_i$. The next step in
our construction is to plug these estimates into representation (2.6) to obtain

(2.22)  $M_{2,n}^{IP}(dt|x) = \frac{H_{2,n}^{IP}(dt|x)}{H_n^{IP}(t|x)},$

which defines an increasing function with jumps of size less than or equal to one. This implies
that $F_{L,n}^{IP}(t|x) = \prod_{(t,\infty]} (1 - M_{2,n}^{IP}(ds|x))$ is also increasing. For the rest of the construction,
observe the following Lemma, which will be proved at the end of this section.

Lemma 2.3 Assume that $Y_i \ne Y_j$ for $i \ne j$. Then the function

(2.23)  $\Lambda_{T,n}^{IP}(dt|x) := \frac{H_{0,n}^{IP}(dt|x)}{F_{L,n}^{IP}(t-|x) - H_n^{IP}(t-|x)}$

is nonnegative, increasing and has jumps of size less than or equal to one.

This in turn yields the estimate

(2.24)  $F_{T,n}^{IP}(t|x) = 1 - \prod_{[0,t]} \big(1 - \Lambda_{T,n}^{IP}(ds|x)\big).$


[0,t] 
In the final step we now simply invert the resulting estimate of the conditional distribution
function $F_{T,n}^{IP}$, since it is increasing by construction. We denote this estimator of the
conditional quantile function by

(2.25)  $\hat q^{IP}(\tau|x) := \sup\{s : F_{T,n}^{IP}(s|x) \le \tau\}.$

In the next section, we will discuss asymptotic properties of the two proposed estimates $\hat q$ and $\hat q^{IP}$
of the conditional quantile curve.
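
The operators used in the two constructions above can be approximated on a grid. The following Python/NumPy sketch is an illustration only and not the implementation used in the paper: `psi_J` approximates the integral in (2.16) by a midpoint rule with `m` cells (both names are hypothetical), and `increasing_rearrangement` uses the fact that for a function sampled on an equispaced grid the increasing rearrangement is obtained by sorting the sampled values.

```python
import numpy as np

def psi_J(f, j1, j2, y, m=1000):
    """Approximate j1 + int_J I{f(u) <= y} du for J = [j1, j2].

    f must accept a NumPy array; y may be a scalar or an array.
    """
    width = (j2 - j1) / m
    u = j1 + (np.arange(m) + 0.5) * width        # midpoints of the m cells
    fu = np.asarray(f(u), dtype=float)
    y = np.atleast_1d(np.asarray(y, dtype=float))
    # Lebesgue measure of {u in J : f(u) <= y}, approximated cell by cell
    measure = width * (fu[None, :] <= y[:, None]).sum(axis=1)
    return j1 + measure

def increasing_rearrangement(f_vals):
    """Increasing rearrangement of values sampled on an equispaced grid."""
    return np.sort(np.asarray(f_vals, dtype=float))
```

For a strictly increasing function such as $f(u) = u^2$ on $J = [0, 2]$, `psi_J` approximately returns the inverse $\sqrt{y}$; for a non-monotone input it still returns an isotone function of $y$, and sorting leaves every $L^p$-norm of the sampled values unchanged.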

Remark 2.4 In the classical right censoring case, there is no uniformly good way to define the
Kaplan-Meier estimator beyond the largest uncensored observation [see e.g. Fleming and Harrington
(1991), page 105]. Typical approaches include setting it to unity, to the value at the largest
uncensored observation, or to consider it unobservable within certain bounds [for more details,
see the discussion in Fleming and Harrington (1991), page 105 and Anderson et al. (1993), page
260]. When censoring is light, the first of the above mentioned approaches seems to yield the best
results [see Anderson et al. (1993), page 260].

When the data can be censored from either left or right, the situation becomes even more
complicated since now we also have to find a reasonable definition below the smallest uncensored
observation. From definitions (2.6)-(2.9) it is easy to see that $F_{T,n}$ equals zero below the
smallest uncensored observation with non-vanishing weight and is constant at the largest uncensored
observation and above. In practice, the latter implies that the estimators $\hat q(\tau|x)$ and $\hat q^{IP}(\tau|x)$
are not defined as soon as $\sup_t F_{T,n}(t|x) < \tau$ or $\sup_t F_{T,n}^{IP}(t|x) < \tau$, respectively. A simple ad-hoc
solution to this problem is to define the estimator $F_{T,n}$ or $F_{T,n}^{IP}$ as 1 beyond the last observation
with non-vanishing weight or to locally increase the bandwidth. A detailed investigation of this
problem is postponed to future research.



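We conclude this section with the proof of Lemma 2.3.

Proof of Lemma 2.3 In order to see that $\Lambda_{T,n}^{IP}(dt|x)$ is increasing, we note that

$H_n^{IP}(t-|x) = \prod_{[t,\infty)} \Big(1 - \frac{H_n^{IP}(ds|x)}{H_n^{IP}(s|x)}\Big) = \prod_{[t,\infty)} \Big(1 - \frac{H_{2,n}^{IP}(ds|x)}{H_n^{IP}(s|x)} - \frac{H_{0,n}^{IP}(ds|x) + H_{1,n}^{IP}(ds|x)}{H_n^{IP}(s|x)}\Big)$

$\le \prod_{[t,\infty)} \Big(1 - \frac{H_{2,n}^{IP}(ds|x)}{H_n^{IP}(s|x)}\Big) = F_{L,n}^{IP}(t-|x).$

Thus $F_{L,n}^{IP}(t-|x) - H_n^{IP}(t-|x) \ge 0$ and the nonnegativity of $\Lambda_{T,n}^{IP}(dt|x)$ is established. In order to
prove the inequality $\Delta\Lambda_{T,n}^{IP}(dt|x) \le 1$ we assume without loss of generality that $Y_1 < Y_2 < \dots < Y_n$.
Observe that as soon as $\delta_k = 0$ we have for $k \ge 2$

$F_{L,n}^{IP}(Y_k-|x) - H_n^{IP}(Y_k-|x) = \prod_{[Y_k,\infty)} \Big(1 - \frac{H_{2,n}^{IP}(ds|x)}{H_n^{IP}(s|x)}\Big) - \prod_{[Y_k,\infty)} \Big(1 - \frac{H_n^{IP}(ds|x)}{H_n^{IP}(s|x)}\Big)$

$\overset{(*)}{=} \prod_{j \ge k+1,\, \delta_j = 2} \frac{H_n^{IP}(Y_{j-1}|x)}{H_n^{IP}(Y_j|x)} - \frac{H_n^{IP}(Y_{k-1}|x)}{H_n^{IP}(Y_k|x)} \prod_{j \ge k+1,\, \delta_j = 2} \frac{H_n^{IP}(Y_{j-1}|x)}{H_n^{IP}(Y_j|x)} \prod_{j \ge k+1,\, \delta_j \ne 2} \frac{H_n^{IP}(Y_{j-1}|x)}{H_n^{IP}(Y_j|x)}$

$= \prod_{j \ge k+1,\, \delta_j = 2} \frac{H_n^{IP}(Y_{j-1}|x)}{H_n^{IP}(Y_j|x)} \Big(1 - \frac{H_n^{IP}(Y_{k-1}|x)}{H_n^{IP}(Y_k|x)} \prod_{j \ge k+1,\, \delta_j \ne 2} \frac{H_n^{IP}(Y_{j-1}|x)}{H_n^{IP}(Y_j|x)}\Big)$

$\ge \prod_{j \ge k+1} \frac{H_n^{IP}(Y_{j-1}|x)}{H_n^{IP}(Y_j|x)} \cdot \frac{H_n^{IP}(Y_k|x) - H_n^{IP}(Y_{k-1}|x)}{H_n^{IP}(Y_k|x)}$

$= \frac{H_n^{IP}(Y_k|x)}{H_n^{IP}(Y_n|x)} \cdot \frac{H_n^{IP}(Y_k|x) - H_n^{IP}(Y_{k-1}|x)}{H_n^{IP}(Y_k|x)} \ge \Delta H_n^{IP}(Y_k|x) \overset{(**)}{=} \Delta H_{0,n}^{IP}(Y_k|x),$

where the equalities (*) and (**) follow from $\delta_k = 0$. An analogous result for $k = 1$ follows by
simple algebra. Hence we have established that for $\delta_k = 0$ we have $\Delta\Lambda_{T,n}^{IP}(Y_k|x) \le 1$, and all the
other cases need not be considered since we adopted the convention '0/0=0'. Thus the proof is
complete. □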

3 Main results 



The results stated in this section describe the asymptotic properties of the proposed estimators.
In particular, we investigate weak convergence of the processes $\{H_{k,n}(t|x)\}_t$, $\{F_{T,n}(t|x)\}_t$, etc.,
where the predictor $x$ is fixed. Our main results deal with the weak uniform consistency and the
weak convergence of the process $\{F_{T,n}(t|x) - F_T(t|x)\}_t$ and the corresponding quantile processes
obtained in Section 2. In order to derive the process convergence, we will assume that it holds
for the initial estimates $H_n$, $H_{k,n}$ and give sufficient conditions for this property in Lemma 3.3.
In a next step we apply the delta method [see Gill (1989)] to the map $(H, H_2) \mapsto M_2$ defined in
(2.6) and the product-limit maps defined in (2.7) and (2.9). Note that the product-limit maps are
Hadamard differentiable on the set of cadlag functions with total variation bounded by a constant
[see Lemma A.1 on page 42 in Patilea and Rolin (2001)], and hence the process convergence of
$M_{2,n}$ and $\Lambda_{T,n}$ will directly entail the weak convergence results for $F_{L,n}$ and $F_{T,n}$, respectively.
However, the Hadamard differentiability of the map $(H_2, H) \mapsto M_2$ only holds on domains where
$H(t) > \varepsilon > 0$, and hence more work is necessary to obtain the corresponding weak convergence
results on the interval $[t_{00}, \infty]$ if $H(t_{00}|x) = 0$, where

(3.1)  $t_{00} := \inf\{t : H_0(t|x) > 0\}.$

This situation occurs for example if $F_R(t_{00}|x) = 0$, which is quite natural in the context considered
in this paper because $R$ is the right censoring variable.

For the sake of a clear presentation and for later reference, we present all required technical
conditions for the asymptotic results at the beginning of this section. We assume that the estimators
of the conditional subdistribution functions are of the form (2.5) with weights $W_i(x)$ depending
on the covariates $X_1, \dots, X_n$ but not on $Y_1, \dots, Y_n$ or $\delta_1, \dots, \delta_n$. The first set of conditions concerns
the weights that are used in the representation (2.5). Throughout this paper, denote by $\|\cdot\|$ the
maximum norm on $\mathbb{R}^d$.



(W1) With probability tending to one, the weights in (2.5) can be written in the form

$W_i(x) = \frac{V_i(x)}{\sum_{j=1}^n V_j(x)},$

where the real-valued functions $V_j$ ($j = 1, \dots, n$) have the following properties:

(1) There exist constants $0 < \underline{c} < \overline{c} < \infty$ such that for all $n \in \mathbb{N}$ and all $x$ we have either
$V_j(x) = 0$ or $\underline{c}/nh^d \le V_j(x) \le \overline{c}/nh^d$.

(2) If $\|x - X_j\| \le Ch$ for some constant $C < \infty$, then $V_j(x) \ne 0$, and $V_j(x) = 0$ for
$\|x - X_j\| \ge c_n$ for some sequence $(c_n)_{n \in \mathbb{N}}$ such that $c_n = O(h)$. Without loss of
generality, we will assume that $C = 1$ throughout this paper.

(3) $\sum_i V_i(x) = C(x)(1 + o_P(1))$ for some positive function $C$.

(4) $\sup_t \big\| \sum_i V_i(x)(x - X_i)\,I\{Y_i \le t\} \big\| = O_P(1/\sqrt{nh^d})$.

Here [and throughout this paper] $h$ denotes a smoothing parameter converging to $0$ with
increasing sample size.

(W2) We assume that the weak convergence

$\sqrt{nh^d}\,\big(H_{0,n}(\cdot|x) - H_0(\cdot|x),\ H_{2,n}(\cdot|x) - H_2(\cdot|x),\ H_n(\cdot|x) - H(\cdot|x)\big) \Rightarrow (G_0, G_2, G)$

holds in $D^3[0, \infty]$, where the limit denotes a centered Gaussian process which has a version
with a.s. continuous sample paths and a covariance structure of the form

$\mathrm{Cov}(G_i(s|x), G_i(t|x)) = b(x)\,\big(H_i(s \wedge t|x) - H_i(s|x)H_i(t|x)\big)$

$\mathrm{Cov}(G(s|x), G(t|x)) = b(x)\,\big(H(s \wedge t|x) - H(s|x)H(t|x)\big)$

$\mathrm{Cov}(G_i(s|x), G(t|x)) = b(x)\,\big(H_i(s \wedge t|x) - H_i(s|x)H(t|x)\big)$

for some function $b(x)$. Here and throughout this paper weak convergence is understood as
convergence with respect to the sigma algebra generated by the closed balls in the supremum
norm [see Pollard (1984)].

(W3) The estimators $H_{k,n}(\cdot|x)$ ($k = 0, 1, 2$) and $H_n(\cdot|x)$ are weakly uniformly consistent on the
interval $[0, \infty)$.



Remark 3.1 It will be shown in Lemma 3.3 below that, under suitable assumptions on the
smoothing parameter $h$, important examples for weights satisfying conditions (W1) - (W3) are
given by the Nadaraya-Watson weights

(3.2)  $W_i^{NW}(x) = \frac{\frac{1}{nh^d} K_h(x - X_i)}{\frac{1}{nh^d} \sum_{j=1}^n K_h(x - X_j)} =: \frac{V_i^{NW}(x)}{\sum_j V_j^{NW}(x)},$

or (in one dimension) by the local linear weights

(3.3)  $W_i^{LL}(x) = \frac{\frac{1}{nh} K_h(x - X_i)\,\big(S_{n,2} - (x - X_i)S_{n,1}\big)}{S_{n,2}S_{n,0} - S_{n,1}^2} = \frac{\frac{1}{nh} K_h(x - X_i)\,\big(1 - (x - X_i)S_{n,1}/S_{n,2}\big)}{\frac{1}{nh} \sum_j K_h(x - X_j)\,\big(1 - (x - X_j)S_{n,1}/S_{n,2}\big)} =: \frac{V_i^{LL}(x)}{\sum_j V_j^{LL}(x)},$

where $K_h(\cdot) := K(\cdot/h)$, $S_{n,k} := \frac{1}{nh} \sum_j K_h(x - X_j)(x - X_j)^k$ and the kernel satisfies the following
condition.

(K1) The kernel $K$ in (3.2) and (3.3) is a symmetric density of bounded total variation with
compact support, say $[-1, 1]$, which satisfies $c_1 \le K(x) \le c_2$ for all $x$ with $K(x) \ne 0$ for
some constants $0 < c_1 \le c_2 < \infty$.
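
The weights (3.2) and (3.3) are straightforward to compute for $d = 1$. The following Python/NumPy sketch is meant as an illustration only; the function names are hypothetical, and the uniform kernel is chosen merely because it is bounded away from zero on its support as required in (K1).

```python
import numpy as np

def uniform_kernel(u):
    """Uniform density on [-1, 1]; bounded away from zero on its support."""
    return 0.5 * (np.abs(u) <= 1.0)

def nw_weights(x, X, h, kernel=uniform_kernel):
    """Nadaraya-Watson weights W_i^{NW}(x) for a scalar covariate."""
    X = np.asarray(X, dtype=float)
    k = kernel((x - X) / h)                  # K_h(x - X_i) = K((x - X_i)/h)
    total = k.sum()
    if total == 0:
        raise ValueError("no observation within the kernel window at x")
    return k / total

def local_linear_weights(x, X, h, kernel=uniform_kernel):
    """Local linear weights W_i^{LL}(x) for a scalar covariate."""
    X = np.asarray(X, dtype=float)
    n = X.size
    k = kernel((x - X) / h)
    d = x - X
    S0, S1, S2 = ((k * d**p).sum() / (n * h) for p in (0, 1, 2))
    return k * (S2 - d * S1) / (n * h * (S2 * S0 - S1**2))
```

Both weight vectors sum to one. The local linear weights additionally satisfy $\sum_i W_i^{LL}(x)(x - X_i) = 0$, but they can be negative, which is why the truncation and rearrangement steps of Section 2.2 are needed.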

For the distributions of the random variables $(T_i, L_i, R_i, X_i)$ we assume that for some $\varepsilon > 0$ with
$U_\varepsilon(x) := \{y : |y - x| < \varepsilon\}$

(D1) The conditional distribution function $F_R$ fulfills $F_R(t_{00}|x) < 1$.

(D2) For $i = 0, 1, 2$ we have $\lim_{y \to x} \sup_t |H_i(t|y) - H_i(t|x)| = 0$.

(D3) The conditional distribution functions $F_L(\cdot|x)$, $F_R(\cdot|x)$, $F_T(\cdot|x)$ have densities,
say $f_L(\cdot|x)$, $f_R(\cdot|x)$, $f_T(\cdot|x)$, with respect to the Lebesgue measure.



m i: 



IlJuIx) 



/too Fliu\x)Fsiu\x) 

(D5) supfc^i ,.^^/^ 



du < oo 



'too Fl(u\x)Fs(u\x) 

(D6) sup,.j^i^,i sup(^^2)g(ij,g^o^)xi/e(x) 






du < OO 



dz.d. 



fL{t\z) 



< oo 



(D7) The functions $H_k(t|x)$ ($k = 0, 1, 2$) are twice continuously differentiable with respect to the
second component in some neighborhood $U_\varepsilon(x)$ of $x$ and for $k = 0, 1, 2$ we have

$\sup_{i,j = 1, \dots, d}\ \sup_t\ \sup_{|y - x| < \varepsilon} |\partial_{y_i} \partial_{y_j} H_k(t|y)| < \infty.$

(D8) The distribution function $F_X$ of the covariates $X_i$ is twice continuously differentiable in
$U_\varepsilon(x)$. Moreover, $F_X$ has a uniformly continuous density $f_X$ with $f_X(x) \ne 0$.

(D9) There exists a constant $C > 0$ such that $H(t|y) \ge C\,H(t|x)$ for all $(t, y) \in [t_{00}, t_{00} + \varepsilon) \times I$,
where $I$ is a set with the property $\int_{I \cap U_\delta(x)} f_X(s)\,ds \ge c\,\delta^d$ for some $c > 0$ and all $0 < \delta \le \varepsilon$.

(DIO) 1^ = 1^(1 + 0(1)) uniformly in t G [too, ^oo + e) as y -^ x 

(D11) For $\tau_{T,0}(x) := \inf\{t : F_T(t|x) > 0\}$ we have $\inf_{y \in U_\varepsilon(x)} F_L(\tau_{T,0}(y)|y) > 0$.



Remark 3.2 From the definition of $t_{00}$ and $H_0$ we immediately see that under condition (D1) we
have $t_{00} = \tau_{T,0}(x) \vee \tau_{L,0}(x)$, where we use the notation $\tau_{L,0}(x) := \inf\{t : F_L(t|x) > 0\}$. In particular,
this implies that under either of the assumptions (D4) or (D11) the equality $t_{00} = \tau_{T,0}(x)$ holds.

Finally, we make some assumptions for the smoothing parameter.

(B1) $nh^{d+4} \log n = o(1)$ and $nh^d \to \infty$.
(B2) $h \to 0$ and $nh^d / \log n \to \infty$.



Some important practical examples for weights satisfying conditions (W1) - (W3) include
Nadaraya-Watson and local linear weights. This is the assertion of the next Lemma.

Lemma 3.3

1. Conditions (W1)(1) and (W1)(2) are fulfilled for the Nadaraya-Watson weights $W_i^{NW}$ with a
kernel $K$ satisfying condition (K1). If the density $f_X$ is continuous at the point $x$, condition
(W1)(3) also holds. Finally, if the function $x \mapsto f_X(x)F_Y(t|x)$ is continuously differentiable
in a neighborhood of $x$ for every $t$ with uniformly (in $t$) bounded first derivative and (B1) is
fulfilled, condition (W1)(4) holds.

If additionally to these assumptions $d = 1$ and the density $f_X$ of the covariates $X$ is
continuously differentiable at $x$ with bounded derivative, condition (W1) also holds for the local
linear and rearranged local linear weights $W_i^{LL}$ and $W_i^{IP}$ defined in (3.3) and (2.20), (2.21)
respectively, provided that the corresponding kernel fulfills condition (K1).

2. If under assumptions (D7), (D8) and (B1) the density $f_X$ is twice continuously
differentiable with uniformly bounded derivative, condition (W2) holds for the Nadaraya-Watson ($d$
arbitrary), local linear ($d = 1$) or rearranged local linear ($d = 1$) weights based on a positive,
symmetric kernel with compact support.

3. If under assumptions (B2), (D2), (D3) the density $f_X$ is twice continuously differentiable
with uniformly bounded derivative, condition (W3) holds for the Nadaraya-Watson weights
$W_i$ based on a positive, symmetric kernel with compact support ($d$ arbitrary). If additionally
$d = 1$ and the density $f_X$ of the covariates $X$ is continuously differentiable at $x$ with bounded
derivative, condition (W3) also holds for local linear or rearranged local linear weights.

The proof of this Lemma is standard; a sketch can be found in the Appendix.



Note that assumption (B1) does not allow the choice $h \sim n^{-1/(d+4)}$, which would be the MSE-optimal rate for Nadaraya-Watson or local linear weights and functions with two continuous derivatives with respect to the predictor. This assumption has been made for the sake of a transparent presentation and implies that the bias of the estimates is negligible compared to the stochastic part. Such an approach is standard in nonparametric estimation for censored data, see Dabrowska (1987) or Li and Doss (1995). In principle, most results of the present paper can be extended to bandwidths $h \sim n^{-1/(d+4)}$ if a corresponding bias term is subtracted.
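To see why the MSE-optimal rate is excluded, it may help to compare the usual orders of bias and stochastic error. The orders below are the standard ones for kernel smoothers with two continuous derivatives and are stated here as an assumption; they are not a transcription of condition (B1).

```latex
\text{bias} = O(h^2), \qquad
\text{stochastic part} = O_P\big((nh^d)^{-1/2}\big), \qquad
\frac{h^2}{(nh^d)^{-1/2}} = \sqrt{n h^{d+4}} .
```

With $h \sim n^{-1/(d+4)}$ the product $nh^{d+4}$ stays bounded away from zero, so the bias is of the same order as the stochastic part. The bias becomes negligible only if $nh^{d+4} \to 0$, that is, for bandwidths slightly smaller than the MSE-optimal one.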

Another useful property of estimators constructed from weights satisfying condition (W1) is that they are increasing with probability tending to one.



Lemma 3.4 Under condition (W1)(1) we have

$$P\big(\text{"The estimates } H_n(\cdot|x),\ H_{0,n}(\cdot|x),\ H_{1,n}(\cdot|x),\ H_{2,n}(\cdot|x) \text{ are increasing"}\big) \to 1 \quad (n \to \infty).$$

The Lemma follows from the relation

$$\{\text{"The estimates } H_n(\cdot|x),\ H_{0,n}(\cdot|x),\ H_{1,n}(\cdot|x),\ H_{2,n}(\cdot|x) \text{ are increasing"}\} \supseteq \{W_i(x) \ge 0 \ \ \forall\, i\}$$

and the fact that under assumption (W1) the probability of the event on the right hand side converges to one. We will use Lemma 3.4 for the analysis of the asymptotic properties of the conditional quantile estimators in Section 3.2. One noteworthy consequence of the Lemma is the fact that

$$P\big(\hat q^{IP}(\cdot|x) = \hat q(\cdot|x)\big) \to 1,$$

which follows because the rearrangement mapping and the right continuous inversion mapping coincide on the set of nondecreasing functions. In particular, this indicates that, from an asymptotic point of view, it does not matter which of the estimators $\hat q$, $\hat q^{IP}$ is used. The difference between both estimators will only be visible in finite samples, see Section 4. In fact, it can only occur if one of the estimators $H_n$, $H_{k,n}$ is decreasing at some point.
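The statement that rearrangement leaves increasing functions unchanged can be checked numerically. The sketch below is an illustration under stated assumptions, not the estimator of Section 2: on an equispaced grid the increasing rearrangement of a vector of function values is simply its sorted version, and `generalized_inverse` is a hypothetical helper that returns the smallest grid point at which the values reach a level `p`.

```python
import numpy as np

def rearrange(values):
    """Discrete increasing rearrangement of function values on an equispaced grid."""
    return np.sort(values)

def generalized_inverse(grid, increasing_values, p):
    """Smallest grid point t with increasing_values(t) >= p (last grid point if none)."""
    idx = np.searchsorted(increasing_values, p, side="left")
    return grid[min(idx, len(grid) - 1)]

if __name__ == "__main__":
    grid = np.linspace(0.0, 1.0, 11)
    monotone = grid**2                      # already increasing
    wiggly = monotone.copy()
    wiggly[4], wiggly[5] = wiggly[5], wiggly[4]   # one local decrease

    print(np.array_equal(rearrange(monotone), monotone))  # rearrangement is the identity here
    print(np.array_equal(rearrange(wiggly), wiggly))      # rearrangement changes the curve
    print(generalized_inverse(grid, rearrange(wiggly), 0.25))
```

Only the non-monotone input is altered, which mirrors the remark that the two quantile estimators can differ only when one of the underlying estimates decreases somewhere.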



3.1 Weak convergence of the estimate of the conditional distribution 

We are now ready to describe the asymptotic properties of the estimates defined in Section 2. Our first result deals with the weak uniform consistency of the estimate $F_{T,n}(\cdot|x)$ under some rather weak conditions. In particular, it requires neither the existence of densities of the conditional distribution functions [see (D3)] nor integrability conditions like (D4).



Theorem 3.5 If conditions (D1), (D3), (D11), (W1)(1)-(W1)(3) and (W3) are satisfied, then the following statements are correct.



1. The estimate $F_{T,n}(\cdot|x)$ defined in (2.12) is weakly uniformly consistent on the interval $[0, \tau]$ for any $\tau$ such that $F_S(\tau|x) < 1$.

2. If additionally $F_S(\tau_{T,1}(x)|x) = 1$, where

$$\tau_{T,1}(x) := \sup\{t : F_T(t|x) < 1\},$$

and $F_{T,n}(\cdot|x)$ is increasing and takes values in the interval $[0, 1]$, the weak uniform consistency of the estimate $F_{T,n}(\cdot|x)$ holds on the interval $[0,\infty)$.

The next two results deal with the weak convergence of $F_{T,n}$ and require additional assumptions on the censoring distribution. We begin with a result for the estimator $F_{L,n}$, which is computed in the first step of our procedure by formulas (2.6) and (2.7).
Theorem 3.6

1. Let the weights used for $H_{2,n}$ and $H_n$ in the definition of the estimate $M_{2,n}^-$ in (2.11) satisfy conditions (W1) and (W2). Moreover, assume that conditions (B1), (D1) and (D3)-(D10) hold. Then we have as $n \to \infty$

$$\sqrt{nh^d}\,\big(H_n - H,\ H_{0,n} - H_0,\ M_{2,n}^- - M_2^-\big) \Rightarrow (G, G_0, G_M)$$

in $D^3([t_{00}, \infty])$, where $(G, G_0, G_M)$ denotes a centered Gaussian process with a.s. continuous sample paths and $G_M(t) = A(t) - B(t)$ is defined by

$$A(t) := \int_t^{\infty} \frac{dG_2(u)}{H(u|x)}, \qquad B(t) := \int_t^{\infty} \frac{G(u)}{H^2(u|x)}\, H_2(du|x). \tag{3.4}$$

Here the process $(G_0, G_2, G)$ is specified in assumption (W2) and the integral with respect to the process $G_2$ is defined via integration by parts.

2. Under the conditions of the first part we have

$$\sqrt{nh^d}\,\big(H_n - H,\ H_{0,n} - H_0,\ F_{L,n} - F_L\big) \Rightarrow (G, G_0, G_3)$$

in $D^3([t_{00},\infty])$, where the process $(G_0, G_2, G)$ is specified in assumption (W2) and $G_3$ is a centered Gaussian process with a.s. continuous sample paths which is defined by

$$G_3(t) = F_L(t|x)\,G_M(t).$$
Remark 3.7 The value of the process $G_M$ at the point $t_{00}$ is defined as its path-wise limit. The existence of this limit follows from assumption (D4) and the representation

$$E[G_M(s)G_M(t)] = b(x)\int_{s\vee t}^{\infty} \frac{1}{H(u|x)}\, M_2^-(du|x)$$

for the covariance structure of $G_M$, which can be derived by computations similar to those in Patilea and Rolin (2001).



Theorem 3.8 Assume that the conditions of Theorem 3.6 and condition (D11) are satisfied. Moreover, let $t_{00} < \tau$ such that $F_S([0, \tau]|x) < 1$. Then we have the following weak convergence.

1.

$$\sqrt{nh^d}\,\big(\Lambda_{T,n}^- - \Lambda_T^-\big) \Rightarrow V$$

in $D([0,\tau])$, where

$$V(t) := \int_0^t \frac{G_0(du)}{(F_L - H)(u-|x)} - \int_0^t \frac{G_3(u-) - G(u-)}{(F_L - H)^2(u-|x)}\, H_0(du|x)$$

is a centered Gaussian process with a.s. continuous sample paths and the integral with respect to $G_0$ is defined via integration by parts.

2.

$$\sqrt{nh^d}\,\big(F_{T,n} - F_T\big) \Rightarrow W$$

in $D([0,\tau])$, where

$$W(t) := \big(1 - F_T(t|x)\big)\,V(t)$$

is a centered Gaussian process with a.s. continuous sample paths.

Note that the second part of Theorem 3.8 follows from the first part using the representation (2.13) and the delta method.
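The delta-method step can be made explicit in a simplified setting. The display below is a heuristic sketch under two assumptions that are not checked here: that (2.13) is the product-limit representation of $F_T$ in terms of its cumulative hazard, and that this hazard is continuous, so that $1 - F_T = \exp(-\Lambda_T)$.

```latex
\sqrt{nh^d}\,\big(F_{T,n}(t|x) - F_T(t|x)\big)
  = \sqrt{nh^d}\,\big(e^{-\Lambda_T(t|x)} - e^{-\Lambda_{T,n}(t|x)}\big)
  \approx e^{-\Lambda_T(t|x)}\,\sqrt{nh^d}\,\big(\Lambda_{T,n}(t|x) - \Lambda_T(t|x)\big)
  \;\Rightarrow\; \big(1 - F_T(t|x)\big)\,V(t).
```

The factor $1 - F_T(t|x)$ in the definition of $W$ is thus the derivative of the map from the cumulative hazard to the distribution function.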



3.2 Weak convergence of conditional quantile estimators 

In this subsection we discuss the asymptotic properties of the two conditional quantile estimates $\hat q$ and $\hat q^{IP}$ defined in (2.17) and (2.25), respectively. As an immediate consequence of Theorem 3.5 and the continuity of the quantile mapping [see Gill (1989), Proposition 1] we obtain the weak consistency result.

Theorem 3.9 If the assumptions of the first part of Theorem 3.5 are satisfied and additionally the conditions $F_S(F_T^{-1}(\tau|x)|x) < 1$ and $\inf_{\varepsilon \le t \le \tau} f_T(t|x) > 0$ hold for some $\varepsilon > 0$, then the estimators $\hat q(\cdot|x)$ and $\hat q^{IP}(\cdot|x)$ defined in (2.17) and (2.25) are weakly uniformly consistent on the interval $[\varepsilon, \tau]$.

The compact differentiability of the quantile mapping and the delta method yield the following result.

Theorem 3.10 If the assumptions of Theorem 3.8 are satisfied, then we have for any $\varepsilon > 0$ and $\tau > 0$ with $F_S(F_T^{-1}(\tau|x)|x) < 1$ and $\inf_{\varepsilon \le t \le \tau} f_T(t|x) > 0$

$$\sqrt{nh^d}\,\big(\hat q(\cdot|x) - F_T^{-1}(\cdot|x)\big) \Rightarrow Z(\cdot) \quad \text{on } D([\varepsilon,\tau]),$$

$$\sqrt{nh^d}\,\big(\hat q^{IP}(\cdot|x) - F_T^{-1}(\cdot|x)\big) \Rightarrow Z(\cdot) \quad \text{on } D([\varepsilon,\tau]),$$

where $Z$ is a centered Gaussian process defined by

$$Z(\cdot) = -\frac{W \circ F_T^{-1}(\cdot|x)}{f_T(\cdot|x) \circ F_T^{-1}(\cdot|x)}$$

and the centered Gaussian process $W$ is defined in part 2 of Theorem 3.8.



The proofs of Theorems 3.5-3.10 are presented in Appendix A and require several separate steps. A main step in the proof is a result regarding the weak convergence of the Beran estimator on the maximal possible domain in the setting of conditional right censorship. We were not able to find such a result in the literature. Because this question is of independent interest, it is presented separately in the following subsection.



3.3 A new result for the Beran estimator 



We consider the common conditional right censorship model [see Dabrowska (1987) for details]. Assume that our observations consist of the triples $(X_i, Z_i, \Delta_i)$ where $Z_i = \min(B_i, D_i)$, $\Delta_i = I_{\{Z_i = D_i\}}$, the random variables $B_i$, $D_i$ are independent conditionally on $X_i$ and nonnegative almost surely. The aim is to estimate the conditional distribution function $F_D$ of $D_i$. Following Beran (1981) this can be done by estimating $F_Z$, the conditional distribution function of $Z$, and $\pi_k(t|x) := P(Z_i \le t, \Delta_i = k \mid X = x)$ ($k = 0, 1$) through

$$F_{Z,n}(t|x) := \sum_i W_i(x) I_{\{Z_i \le t\}}, \qquad \pi_{k,n}(t|x) := \sum_i W_i(x) I_{\{Z_i \le t,\ \Delta_i = k\}} \quad (k = 0, 1) \tag{3.5}$$

and then defining an estimator for $F_D$ as

$$F_{D,n}(t|x) := 1 - \prod_{[0,t]} \big(1 - \Lambda_{D,n}^-(ds|x)\big), \tag{3.6}$$

where the quantity $\Lambda_{D,n}^-(ds|x)$ is given by

$$\Lambda_{D,n}^-(ds|x) := \frac{\pi_{0,n}(ds|x)}{1 - F_{Z,n}(s-|x)} \tag{3.7}$$

and the $W_i(x)$ denote local weights depending on $X_1, \dots, X_n$ [see also the discussion at the beginning of Section 3].
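The construction in (3.5)-(3.7) can be sketched in a few lines. The code below is an illustrative implementation under assumptions, not the authors' code: it assumes no ties among the observed times, takes the local weights as given, and receives the information on which observations are uncensored as an explicit boolean array `observed`, so that it does not depend on how the indicator values are numbered.

```python
import numpy as np

def beran_cdf(t_grid, Z, observed, w):
    """Weighted product-limit estimate of the conditional cdf of D on t_grid.

    Z        : observed times min(B_i, D_i), assumed free of ties
    observed : boolean array, True where D_i itself was observed
    w        : local weights W_i(x), assumed nonnegative and summing to one
    """
    order = np.argsort(Z, kind="stable")
    Z, observed, w = Z[order], observed[order], w[order]
    # Weighted mass still at risk at Z_i, i.e. 1 - F_{Z,n}(Z_i - | x).
    at_risk = np.cumsum(w[::-1])[::-1]
    jump = np.where(observed, w, 0.0)
    hazard = np.divide(jump, at_risk, out=np.zeros_like(at_risk), where=at_risk > 0)
    survival = np.cumprod(1.0 - hazard)
    n_le = np.searchsorted(Z, t_grid, side="right")  # number of Z_i <= t
    return np.where(n_le > 0, 1.0 - survival[np.maximum(n_le - 1, 0)], 0.0)

if __name__ == "__main__":
    Z = np.array([1.0, 2.0, 3.0, 4.0])
    observed = np.array([True, False, True, True])
    w = np.full(4, 0.25)  # equal weights reduce the estimate to Kaplan-Meier
    print(beran_cdf(np.array([0.5, 3.0, 10.0]), Z, observed, w))
```

With equal weights and no censoring the function returns the empirical distribution function, which is a convenient sanity check before plugging in kernel weights.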

The weak convergence of the process $\sqrt{nh^d}\,(F_{D,n}(t|x) - F_D(t|x))_t$ in $D([0,\tau])$ with $\pi_0(\tau|x) < 1$ was first established by Dabrowska (1987). An important problem is to establish conditions that ensure that the weak convergence can be extended to $D([0,t_0])$ where $t_0 := \sup\{s : \pi_0(s|x) < 1\}$. In the unconditional case, such conditions were derived by Gill (1983) who used counting process techniques. A generalization of this method to the conditional case was first considered by McKeague and Utikal (1990) and later exploited by Dabrowska (1992b) and Li and Doss (1995). However, none of those authors considered weak convergence on the maximal possible interval $[0,t_0]$. The following Theorem provides sufficient conditions for the weak convergence on the maximal possible domain.

Theorem 3.11 Assume that for some $\varepsilon > 0$

(R1) The conditional distribution functions $F_D(\cdot|x)$ and $F_B(\cdot|x)$ have densities, say $f_D(\cdot|x)$ and $f_B(\cdot|x)$, with respect to the Lebesgue measure

ri?5;sup,=,_,/„*"^3|^dt<oo, 

(R4) ^'^Vj,k=i,...4^^'9{t,y)^(0M)xu,(x) \dyAM^Ay)\ < oo. 

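(R5) $1 - F_Z(t|y) \ge C\,(1 - F_Z(t|x))$ for all $(t, y) \in (t_0 - \varepsilon, t_0] \times I$ where $I$ is a set with the property $\int_{I \cap U_\delta(x)} f_X(s)\,ds \ge c\,\delta^d$ for some $c > 0$ and all $0 < \delta \le \varepsilon$.

(R6) $\lambda_D(t|y) = \lambda_D(t|x)(1 + o(1))$ uniformly in $t \in (t_0 - \varepsilon, t_0]$ as $y \to x$.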

Moreover, let the weights in (3.5) satisfy condition (W1) and let the weak convergence

$$\sqrt{nh^d}\,\big(F_{Z,n}(\cdot|x) - F_Z(\cdot|x),\ \pi_{0,n}(\cdot|x) - \pi_0(\cdot|x)\big) \Rightarrow (G, G_0) \quad \text{on } D^2([0,\infty))$$

to a centered Gaussian process $(G, G_0)$ with covariance structure given by

$$\mathrm{Cov}(G_0(s|x), G_0(t|x)) = b(x)\big(\pi_0(s \wedge t|x) - \pi_0(s|x)\pi_0(t|x)\big)$$

$$\mathrm{Cov}(G(s|x), G(t|x)) = b(x)\big(F_Z(s \wedge t|x) - F_Z(s|x)F_Z(t|x)\big)$$

$$\mathrm{Cov}(G_0(s|x), G(t|x)) = b(x)\big(\pi_0(s \wedge t|x) - \pi_0(s|x)F_Z(t|x)\big)$$

for some function $b(x)$ hold [this is the case for Nadaraya-Watson or local linear weights, see Lemma 3.3]. Then under assumption (B1)

$$\sqrt{nh^d}\,\big(F_{D,n}(\cdot|x) - F_D(\cdot|x)\big) \Rightarrow G_D(\cdot) \quad \text{in } D([0,t_0]), \tag{3.8}$$

where $G_D$ denotes a centered Gaussian process with covariance structure taking the form



$$\mathrm{Cov}(G_D(t), G_D(s)) = b(x)\big(1 - F_D(s|x)\big)\big(1 - F_D(t|x)\big) \int_0^{s \wedge t} \frac{\Lambda_D(du|x)}{1 - F_Z(u|x)}.$$
4 Finite sample properties

We have performed a small simulation study in order to investigate the finite sample properties of 
the proposed estimates. An important but difficult question in the estimation of the conditional 
distribution function from censored data is the choice of the smoothing parameter. For conditional 
right censored data some proposals regarding the choice of the bandwidth have been made by Dabrowska (1992b) and Li and Datta (2001). In order to obtain a reasonable bandwidth parameter for our simulations, we used a modification of the cross validation procedure proposed by Abberger (2001) in the context of nonparametric quantile regression. To address the presence of censoring in the cross validation procedure, we proceeded as follows:

1. Divide the data into $K$ blocks with respect to the (ordered) $X$-components. Let $\{(Y_{jk}, X_{jk}, \delta_{jk}) \mid j = 1, \dots, J_k\}$ denote the points among $\{(Y_i, X_i, \delta_i) \mid i = 1, \dots, n\}$ which fall in block $k$ ($k = 1, \dots, K$). For our simulations we used $K = 25$ blocks.

2. In each block, estimate the distribution function $F_T$ as described in Section 2.1. Denote the sizes of the jumps at the $j$th uncensored observation in the $k$th block by $w_{jk}$.

3. Define

$$h := \arg\min_{\alpha} \sum_{k=1}^{K} \sum_{j=1}^{J_k} w_{jk}\, \rho_\tau\big(Y_{jk} - \hat q_\alpha^{\,j,k}(\tau|X_{jk})\big),$$

where $\rho_\tau$ denotes the check function and $\hat q_\alpha^{\,j,k}$ is either the estimator $\hat q^{IP}$ or $\hat q$ with bandwidth $\alpha$ based on the sample $\{(Y_i, X_i, \delta_i) \mid i = 1, \dots, n\}$ without the observation $(Y_{jk}, X_{jk}, \delta_{jk})$.

For a motivation of the proposed procedure, observe that the classical cross validation is based on the fact that each observation is an unbiased 'estimator' for the regression function at the corresponding covariate. In the presence of censoring, such an estimator is not available. Therefore, the cross validation criterion discussed above tries to mimic this property by introducing the weights $w_{jk}$. A deeper investigation of the theoretical properties of the procedure is beyond the scope of the present paper and postponed to future research. In order to save computing time the bandwidth that we used for our simulations is an average of 100 cross validation runs in each scenario.
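The criterion in step 3 can be sketched as follows. This is a minimal illustration under assumptions and not the code used for the simulations: the jump sizes and the leave-one-out quantile estimates are taken as given, and `loo_quantiles` is a hypothetical callable that returns, for a candidate bandwidth, the leave-one-out estimate at every retained observation.

```python
import numpy as np

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    u = np.asarray(u, dtype=float)
    return u * (tau - (u < 0))

def cv_criterion(y, jump_weights, q_loo, tau):
    """Weighted score: sum over observations of w * rho_tau(y - q_loo)."""
    return float(np.sum(jump_weights * check_loss(y - q_loo, tau)))

def select_bandwidth(candidates, y, jump_weights, loo_quantiles, tau):
    """Return the candidate bandwidth with the smallest weighted score."""
    scores = [cv_criterion(y, jump_weights, loo_quantiles(a), tau) for a in candidates]
    return candidates[int(np.argmin(scores))]

if __name__ == "__main__":
    y = np.array([1.0, 2.0, 3.0])
    w = np.array([0.2, 0.3, 0.5])
    # Toy leave-one-out rule (purely illustrative): exact at bandwidth 0.2.
    toy = lambda a: y + (a - 0.2)
    print(select_bandwidth([0.1, 0.2, 0.3], y, w, toy, tau=0.5))
```

Censored observations enter only through zero jump weights, which is how the criterion mimics the uncensored case.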

For the calculation of the estimators of the conditional sub-distribution functions, we chose local linear weights [see Remark 3.1] with a truncated version of the Gaussian kernel, i.e.

$$K(x) = \varphi(x)\, I_{\{\varphi(x) > 0.001\}},$$

where $\varphi$ denotes the density of the standard normal distribution.
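The truncation level translates into an explicit compact support. The short computation below is added for orientation only and uses nothing beyond the formula for the standard normal density.

```latex
\varphi(x) > 0.001
\iff \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2} > 0.001
\iff |x| < \sqrt{-2 \log\big(0.001\,\sqrt{2\pi}\big)} \approx 3.46 .
```

The kernel is therefore supported on roughly $[-3.46, 3.46]$ and is bounded away from zero on its support.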

We investigate the finite sample properties of the new estimators in a scenario similar to models 2 and 3 in Yu and Jones (1997) [note that we additionally introduce a censoring mechanism]. The first model is given by

(model 1)

$$T_i = 2.5 + \sin(2X_i) + 2\exp(-16X_i^2) + 0.5\,\mathcal{N}(0,1)$$

$$L_i = 2.6 + \sin(2X_i) + 2\exp(-16X_i^2) + 0.5\,(\mathcal{N}(0,1) + q_{0.1})$$

$$R_i = 3.4 + \sin(2X_i) + 2\exp(-16X_i^2) + 0.5\,(\mathcal{N}(0,1) + q_{0.9})$$

where the covariates $X_i$ are uniformly distributed on the interval $[-2, 2]$ and $q_p$ denotes the $p$-quantile of a standard normal distribution. This means that about 10% of the observations are censored by type $\delta = 1$ and $\delta = 2$, respectively. For the sample size we use $n = 100, 250, 500$. In Figures 2 and 1 we show the mean conditional quantile curves and corresponding mean squared error curves for the 25%, 50% and 75% quantile based on 5000 simulation runs. The cases where $\hat q^{IP}(\tau|x)$ is not defined are omitted in the estimation of the mean squared error and mean curves [this phenomenon occurred in less than 3% of the simulation runs]. Only results for the estimator $\hat q^{IP}$ are presented because it shows a slightly better performance than the estimator $\hat q$. We observe no substantial differences in the performance of the estimates for the 25%, 50% and 75% quantile curves with respect to bias. On the other hand it can be seen from Figure 1




Figure 1: Mean squared error curves of the estimates of the quantile curves in model 1 for different sample sizes: n = 100 (dotted line); n = 250 (dashed line); n = 500 (solid line). Left panel: estimates of the 25%-quantile curves; middle panel: estimates of the 50%-quantile curves; right panel: estimates of the 75%-quantile curves. 10% of the observations are censored by type $\delta = 1$ and $\delta = 2$, respectively.

that the estimates of the quantile curves corresponding to the 25% and 75% quantile have larger variability. In particular the MSE is large at the point 0, where the quantile curves attain their maximum.












Figure 2: Mean (dashed lines) and true (solid lines) quantile curves for model 1 for different sample sizes: n = 100 (left column), n = 250 (middle column) and n = 500 (right column). Upper row: estimates of the 25% quantile curves; middle row: estimates of the 50% quantile curves; lower row: estimates of the 75% quantile curves. 10% of the observations are censored by type $\delta = 1$ and $\delta = 2$, respectively.
As a second example we investigate the effect of different censoring types. To this end, we consider an example similar to model 3 of Yu and Jones (1997), that is

(model 2)

$$T_i = 2 + 2\cos(X_i) + \exp(-4X_i^2) + \mathcal{E}(1)$$

$$L_i = 2 + 2\cos(X_i) + \exp(-4X_i^2) + (c_L + \mathcal{U}[0,1])$$

$$R_i = 2 + 2\cos(X_i) + \exp(-4X_i^2) + (c_R + \mathcal{E}(1))$$

where the covariates $X_i$ are uniformly distributed on the interval $[-2, 2]$, $\mathcal{E}(1)$ denotes an exponentially distributed random variable with parameter 1, $\mathcal{U}[0,1]$ is a uniformly distributed random variable on $[0, 1]$ and the parameters $(c_L, c_R)$ are used to control the amount of censoring. For this purpose we investigate three different cases for the parameters $(c_L, c_R)$, namely $(-0.5, 1.5)$, $(-0.5, 0.5)$ and $(-0.2, 1.5)$, which correspond to approximately (10%, 11%), (30%, 11%) and (11%, 25%) of type $\delta = 1$ and $\delta = 2$ censoring, respectively. The corresponding results for the estimators of the 25%, 50% and 75% quantile on the basis of a sample of $n = 250$ observations are presented in Figures 3 and 4.






Figure 3: Mean squared error curves of the estimates of the quantile curves in model 2 for different 
censoring: (10%, 11%) censoring (dotted line); (30%, 11%) censoring (dashed line); (11%, 25%) 
censoring (solid line). Left panel: estimates of the 25%-quantile curves; middle panel: estimates 
of the 50%-quantile curves; right panel: estimates of the 75%-quantile curves. The sample size is 
n = 250. 

We observe a slight increase in bias when estimating upper quantile curves. An additional amount of censoring results in a slightly worse average behavior of the estimates. More censoring of type $\delta = 2$ has an impact on the accuracy of the estimates of the lower quantiles, while more censoring of type $\delta = 1$ has a stronger effect for the upper quantile curves. Upper quantile curves are always estimated with more variability, which is in accordance with the factor $1/f_T(F_T^{-1}(p|x)|x)$ in their limiting process.






Figure 4: Mean (dashed lines) and true (solid lines) quantile curves for model 2 and different censoring: left column: (10%, 11%) censoring; middle column: (30%, 11%) censoring; right column: (11%, 25%) censoring. Upper row: 25% quantile curves; middle row: 50% quantile curves; lower row: 75% quantile curves. The sample size is n = 250.
A Appendix: Proofs



Proof of Lemma 3.3 We begin with the proof of the first part. Recalling the definition of the Nadaraya-Watson weights in (3.2), we see that (W1)(1) follows easily from the inequality $c_1 \le K(x) \le c_2$ for all $x$ in the support of $K$. Conditions (W1)(2) and (W1)(3) hold with $C(x) = f_X(x)$, which is a standard result from density estimation [see e.g. Parzen (1962)]. Finally, for assumption (W1)(4) we note that, as soon as the function $f_X(\cdot)F_Y(t|\cdot)$ is continuously differentiable in a neighborhood of $x$ with uniformly (in $t$) bounded derivative, we have

$$\sup_t \Big\| \frac{1}{nh^d}\, E\Big[\sum_i K_h(x - X_i)(x - X_i) I_{\{Y_i \le t\}}\Big] \Big\| = O(h^2).$$

From standard empirical process arguments [see for example Pollard (1984)] we therefore obtain

$$\sup_t \frac{1}{nh^d} \Big\| \sum_i K_h(x - X_i)(x - X_i) I_{\{Y_i \le t\}} - E\Big[\sum_i K_h(x - X_i)(x - X_i) I_{\{Y_i \le t\}}\Big] \Big\| = O\Big(\sqrt{\frac{h^2 \log n}{nh^d}}\Big)$$

a.s. and the assertion now follows from condition (B1).



To see that we can also use the local linear weights defined in (3.3), we note that

$$S_{n,0} = f_X(x)(1 + o_P(1)) \tag{A.1}$$

$$S_{n,1} = h^2 \mu_2(K) f_X'(x) + o_P(h^2) \tag{A.2}$$

$$S_{n,2} = h^2 \mu_2(K) f_X(x) + o_P(h^2) \tag{A.3}$$

and from the compactness of the support of $K$, which implies $|x - X_j| = O(h)$ uniformly in $j$, we obtain the representation $V_i^{LL} = V_i^{NW}(1 + o_P(1))$ uniformly in $i$. Conditions (W1)(1) and (W1)(4) for the local linear weights follow from the corresponding properties of the Nadaraya-Watson weights (possibly with slightly smaller and larger constants $\underline{c}$ and $\overline{c}$, respectively).

Finally, from the fact that, with probability tending to one, the local linear weights are positive, it follows that the corresponding estimators $H_n$, $H_{k,n}$ are increasing and hence unchanged by the rearrangement. This implies $P(\exists\, i \in \{1, \dots, n\} : W_i^{LL} \ne W_i^{LLI}) \to 0$, where $W_i^{LLI}$ denote the weights of the rearranged local linear estimator. Thus condition (W1) also holds for the weights $W_i^{LLI}$ and the proof of the first part is complete.

For a proof of the second part of the Lemma we note that the same arguments as given in Dabrowska (1987), Section 3.2, yield condition (W2) for the Nadaraya-Watson weights [here we used assumptions (D7), (D8) and (B1)]. The corresponding result for the local linear weights can be derived by a closer examination of the weights $W_i^{LL}$. For the sake of brevity, we will only consider the estimate $H_n$ defined in (2.5); the results for $H_{k,n}$ ($k = 0, 1, 2$) follow analogously. From the definition of the weights $W_i^{LL}$ we obtain the representation

$$
\begin{aligned}
H_n^{LL}(t|x) &= \frac{1}{nh} \sum_{i=1}^n \frac{K\big(\frac{x - X_i}{h}\big)\big(S_{n,2} - (x - X_i) S_{n,1}\big)}{S_{n,0} S_{n,2} - S_{n,1}^2}\, I_{\{Y_i \le t\}} \\
&= \frac{\frac{1}{nh} \sum_{i=1}^n K\big(\frac{x - X_i}{h}\big) I_{\{Y_i \le t\}}}{S_{n,0}} \cdot \frac{1}{1 - S_{n,1}^2/(S_{n,0} S_{n,2})} - \frac{S_{n,1}}{S_{n,0} S_{n,2} - S_{n,1}^2} \cdot \frac{1}{nh} \sum_{i=1}^n K\Big(\frac{x - X_i}{h}\Big)(x - X_i)\, I_{\{Y_i \le t\}} \\
&= H_n^{NW}(t|x) + O_P(h^2)
\end{aligned}
$$

uniformly in $t$, where the last equality follows from the estimates $H_n^{NW}(t|x) = O_P(1)$ and (A.1)-(A.3). Now condition (B1) ensures $h^2 = o(1/\sqrt{nh})$ and thus the difference $H_n^{LL} - H_n^{NW}$ is asymptotically negligible. From Lemma 3.4 we immediately obtain that, with probability tending to one, the rearranged estimators $H_n^{LLI}$ and $H_{k,n}^{LLI}$ defined in (2.20) and (2.21) coincide with the estimates $H_n^{LL}$ and $H_{k,n}^{LL}$, respectively. Thus condition (W2) also holds for $(H_{0,n}^{LLI}, H_{2,n}^{LLI}, H_n^{LLI})$ and the second part of Lemma 3.3 has been established.

We now turn to the proof of the last part. Again we only consider the process $H_n(\cdot|x)$, and note that the uniform consistency of $H_{k,n}(\cdot|x)$ follows analogously. First, observe the estimate

$$E\Big[\frac{1}{nh^d} \sum_i K_h(x - X_i) I_{\{Y_i \le t\}}\Big] = \frac{1}{h^d} \int K_h(x - u) F_Y(t|u) f_X(u)\, du = f_X(x) F_Y(t|x)(1 + o(1))$$

uniformly in $t$, which is a consequence of condition (D2). From standard empirical process arguments [see Pollard (1984)] it follows that almost surely

$$\sup_t \Big| \frac{1}{nh^d} \sum_i K_h(x - X_i) I_{\{Y_i \le t\}} - E\Big[\frac{1}{nh^d} \sum_i K_h(x - X_i) I_{\{Y_i \le t\}}\Big] \Big| = O\Big(\sqrt{\frac{\log n}{nh^d}}\Big)$$

and with condition (B2) the assertion for the Nadaraya-Watson weights follows. The extension of the result to local linear and rearranged local linear weights can be established by the same arguments as presented in the second part of the proof. □



Remark A.1 Before we begin with the proof of Theorem 3.5, we observe that condition (W1) implies that we can write the weights $W_i(x)$ in the estimates (2.5) in the form

$$W_i(x) = W_i^{(1)}(x) I_{A_n} + W_i^{(2)}(x) I_{A_n^C},$$

where $A_n$ is some event with $P(A_n) \to 1$, $W_i^{(1)}(x) = V_i(x)/\sum_j V_j(x)$ and $W_i^{(2)}(x)$ denote some other weights. If we now define modified weights

$$\tilde W_i(x) := W_i^{(1)}(x) I_{A_n} + W_i^{NW}(x) I_{A_n^C},$$

where $W_i^{NW}(x)$ denote Nadaraya-Watson weights, we obtain $P(\exists\, i \in \{1, \dots, n\} : W_i \ne \tilde W_i) \to 0$, i.e. any estimator constructed with the weights $\tilde W_i(x)$ will have the same asymptotic properties as an estimator based on the original weights $W_i(x)$. Thus we may confine ourselves to the investigation of the asymptotic distribution of estimators constructed from the statistics in (2.5) that are based on the weights $\tilde W_i(x)$. In order to keep the notation simple, the modified estimates are also denoted by $H_n, H_{k,n}$, etc. Finally, observe that we have the representation $\tilde W_i(x) = \tilde V_i(x)/\sum_j \tilde V_j(x)$ with $\tilde V_i := V_i I_{A_n} + V_i^{NW}(x) I_{A_n^C}$. Note that by construction, the random variables $\tilde V_i$ satisfy conditions (W1)(1)-(W1)(4) if the kernel in the definition of $W_i^{NW}(x)$ satisfies assumption (K1).



Proof of Theorem 3.5 Let $S$ denote the set of pairs of functions $(H_2(\cdot|x), H(\cdot|x))$ of bounded variation such that $H(\cdot|x) \ge \beta > 0$. Since the map $(H_2(\cdot|x), H(\cdot|x)) \mapsto M_2^-(\cdot|x)$ is continuous on $S$ with respect to the supremum norm [see the discussion in Andersen et al. (1993) following Proposition II.8.6], and $H_n$ is uniformly consistent [which implies $P((H_{2,n}, H_n) \in S) \to 1$], the weak uniform consistency of $M_{2,n}^-$ on $[t_{00} + \varepsilon, \infty)$ [$\varepsilon > 0$ is arbitrary] follows from the uniform consistency of $H_{2,n}$ and $H_n$. This can be seen by similar arguments as given in Dabrowska (1987), p. 184.

Moreover, the map $M_2^-(\cdot|x) \mapsto F_L(\cdot|x)$ is continuous on the set of functions of bounded variation [reverse time and use the discussion in Andersen et al. (1993) following Proposition II.8.7], and thus the uniform consistency of $F_{L,n}(\cdot|x)$ on $[t_{00} + \varepsilon, \infty)$ follows for any $\varepsilon > 0$.

In the next step, we consider the map

$$(H_{0,n}(\cdot|x), H_n(\cdot|x), F_{L,n}(\cdot|x)) \mapsto \Lambda_{T,n}^-(t|x) = \int_{[0,t)} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)}$$

and split the range of integration into the intervals $[0, t_{00} + \varepsilon)$ and $[t_{00} + \varepsilon, t)$. The continuity of the integration and fraction mappings yields the uniform convergence

$$\sup_{t \in [t_{00}+\varepsilon, \tau)} \Big| \int_{[t_{00}+\varepsilon, t)} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} - \int_{[t_{00}+\varepsilon, t)} \frac{H_0(ds|x)}{F_L(s-|x) - H(s-|x)} \Big| \xrightarrow{P} 0 \tag{A.4}$$

for any $\tau$ with $F_S(\tau|x) < 1$ [note that $\inf_{t \in [t_{00}+\varepsilon, \tau)} F_L(t-|x) - H(t-|x) > 0$ since $F_L(t-|x) - H(t-|x) = F_L(t-|x)(1 - F_S(t-|x))$ and $F_L(t_{00}-|x) > 0$ by assumption (D11) and continuity of the conditional distribution function $F_L(\cdot|x)$]. We now will show that the integral over the interval $[0, t_{00} + \varepsilon)$ can be made arbitrarily small by an appropriate choice of $\varepsilon$. To this end, denote by $W_1(x,n), \dots, W_k(x,n)$ those values of $Y_1, \dots, Y_n$ whose weights fulfill $W_i(x) \ne 0$ and by $W_{(1)}(x,n), \dots, W_{(k)}(x,n)$ the corresponding increasingly ordered values. By Lemma B.2 in Appendix B we can find an $\varepsilon > 0$ such that

$$\sup_{t_{00}+\varepsilon \ge s \ge W_{(2)}(x,n)} \frac{1}{F_{L,n}(s-|x) - H_n(s-|x)} = O_P(1),$$

and it follows

$$\int_{[W_{(2)}(x,n),\, t_{00}+\varepsilon)} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} \le H_{0,n}(t_{00} + \varepsilon|x)\, O_P(1).$$

Therefore it remains to find a bound for the integral $\int_{[0, W_{(2)}(x,n))} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)}$. For this purpose we consider two cases. The first one appears if the $\delta_i$ corresponding to $W_{(1)}(x,n)$ equals 0. In this case there is positive mass at the point $W_{(1)}(x,n)$ but at the same time $F_{L,n}(s|x) = F_{L,n}(W_{(2)}(x,n)|x)$ for all $s \in [0, W_{(2)}(x,n))$ and hence $\int_{[0, t_{00}+\varepsilon)} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} \le H_{0,n}(t_{00} + \varepsilon|x)\, O_P(1)$. For all other values of the corresponding $\delta_i$ the mass of $H_{0,n}(ds|x)$ at the point $W_{(1)}(x,n)$ equals zero and thus the integral vanishes. Summarizing, we have obtained the estimate

$$\int_{[0, t_{00}+\varepsilon)} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} \le H_{0,n}(t_{00} + \varepsilon|x)\, O_P(1) = H_0(t_{00} + \varepsilon|x)\, O_P(1),$$

where the last equality follows from the uniform consistency of $H_{0,n}$ and the remainder $O_P(1)$ does not depend on $\varepsilon$. Moreover, since the function $\Lambda_{T,n}(\cdot|x)$ is increasing [see Lemma 2.3], the inequality

$$\sup_{t \le t_{00}+\varepsilon} |\Lambda_{T,n}(t|x)| = \int_{[0, t_{00}+\varepsilon)} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} \le H_0(t_{00} + \varepsilon|x)\, O_P(1) \tag{A.5}$$

follows. Now for any $\delta > 0$ we can choose an $\varepsilon_\delta > 0$ such that $H_0(t_{00} + \varepsilon_\delta|x) < \delta$ [recall the definition of $t_{00}$ in (3.1)] and we have

$$P\Big(\sup_{t \in [0, t_{00}+\varepsilon_\delta)} |\Lambda_{T,n}(t|x) - \Lambda_T(t|x)| > 2\alpha\Big) \le P\Big(\sup_{t \in [0, t_{00}+\varepsilon_\delta)} |\Lambda_{T,n}(t|x)| > \alpha\Big) \le P\big(O_P(1) > \alpha/\delta\big)$$

whenever $\Lambda_T(t_{00} + \varepsilon_\delta|x) < \alpha$, where the last inequality follows from (A.5) and the remainder $O_P(1)$ does not depend on $\alpha$ and $\delta$. From this estimate we obtain for any $\tau$ with $F_S(\tau|x) < 1$

$$P\Big(\sup_{t \in [0, \tau)} |\Lambda_{T,n}(t|x) - \Lambda_T(t|x)| > 4\alpha\Big) \le P\Big(\sup_{t \in [t_{00}+\varepsilon_\delta, \tau)} |\Lambda_{T,n}(t|x) - \Lambda_T(t|x)| > 2\alpha\Big) + P\big(O_P(1) > \alpha/\delta\big).$$

By (A.4) the first probability on the right hand side of the inequality converges to zero as $n$ tends to infinity for any $\alpha, \varepsilon_\delta > 0$, and the limit of the second one can be made arbitrarily small by choosing $\delta$ appropriately. Thus we obtain $\lim_{n \to \infty} P\big(\sup_{t \in [0,\tau)} |\Lambda_{T,n}(t|x) - \Lambda_T(t|x)| > 4\alpha\big) = 0$, which implies the weak uniform consistency of $\Lambda_{T,n}(\cdot|x)$ on the interval $[0, \tau)$.

Finally, the continuity of the mapping $\Lambda_T \mapsto F_T$ [see the discussion in Andersen et al. (1993) following Proposition II.8.7] yields the weak uniform consistency of the estimate $F_{T,n}$ and the first part of the theorem is established.



For a proof of the second part, we use an idea from Wang (1987). Note that, as soon as $F_{T,n}(\cdot|x)$ is increasing and bounded by 1 from above, we have the inequality $\sup_{t \ge a} |F_{T,n}(t|x) - F_T(t|x)| \le |F_{T,n}(a|x) - F_T(a|x)| + (1 - F_T(a|x))$. Thus

$$\sup_{t \ge 0} |F_{T,n}(t|x) - F_T(t|x)| \le 2 \sup_{0 \le t \le a} |F_{T,n}(t|x) - F_T(t|x)| + 2\big(1 - F_T(a|x)\big),$$

and by assumption and part one of the theorem we can make $1 - F_T(a|x)$ arbitrarily small with uniform consistency on the interval $[0, a]$ still holding. Consequently, we obtain the uniform consistency on $[0, \infty)$, which completes the proof of Theorem 3.5. □



Proof of Theorem 3.6 The second part follows from the first one by the Hadamard differentiability of the map $A \mapsto \prod_{(t,\infty]}(1 - A(ds))$ in definition (2.10) [see Patilea and Rolin (2001), Lemma A.1] and the delta method [Gill (1989)]. Note that these results require a.s. continuity of the sample paths, which follows from the fact that the process $G_M$ defined in the first part of the Theorem has a.s. continuous sample paths together with the continuity of $F_L(\cdot|x)$.

The proof will now proceed in two steps: first we will show that weak convergence holds in $D^3([\sigma, \infty])$ for any $\sigma > t_{00}$ and secondly we will extend this convergence to $D^3([t_{00}, \infty])$. Note that from condition (D4) we obtain $F_L(t_{00}|x) > 0$, and the continuity of $F_L(\cdot|x)$ yields $t_{00} > 0$. Set $\varepsilon > 0$ and choose $\sigma > t_{00}$ such that $H(\sigma|x) > \varepsilon$. Recall that the map

$$(H, H_0, H_2) \mapsto (H, H_0, M_2^-)$$

is Hadamard differentiable on the domain $\mathcal{D} := \{(A_1, A_2, A_3) \in BV_1^3([\sigma, \infty]) : A_1 \ge 0,\ A_3 \ge \varepsilon/2\}$ [see Patilea and Rolin (2001)] and takes values in $BV_C^3([\sigma, \infty])$. Here $BV_C$ denotes the space of functions of bounded variation with elements uniformly bounded by the constant $C$. Moreover, assumption (W2) implies weak convergence and weak uniform consistency of the estimator $H_n$ on $D([\sigma, \infty])$. Therefore $(H_{0,n}, H_{2,n}, H_n)$ will belong to the domain $\mathcal{D}$ with probability tending to one if $n \to \infty$. Hence, we can define the random variable $\tilde H_n := I_{A_n} H_n + I_{A_n^C}$ where $A_n := \{\inf_{t \in [\sigma, \infty]} H_n(t) \ge \varepsilon/2\}$, which certainly has the property $\tilde H_n \ge \varepsilon/2$ on $[\sigma, \infty]$ almost surely. Now, since $P(\tilde H_n \ne H_n) = 1 - P(A_n) \to 0$, the weak convergence result in (W2) continues to hold on $D^3([\sigma, \infty])$ with $H_n$ replaced by $\tilde H_n$. By the same argument, we may replace the $H_n$ in the definition of $M_{2,n}^-$ by $\tilde H_n$ without changing the asymptotics. Thus we can apply the delta method [see Gill (1989), Theorem 3] to $(H_{0,n}, H_{2,n}, \tilde H_n)$ and deduce the weak convergence

$$\sqrt{nh^d}\,\big(H_n - H,\ H_{0,n} - H_0,\ M_{2,n}^- - M_2^-\big) \Rightarrow (G, G_0, G_M) \quad \text{in } D^3([\sigma, \infty]).$$

To obtain the weak convergence in $D^3([t_{00}, \infty])$, we apply a Lemma from Pollard (1984, page 70, Example 11). First define $G_M(t_{00})$ as the pathwise limit of $G_M(\sigma)$ for $\sigma \downarrow t_{00}$; the existence of this limit is discussed in Remark 3.7. Note that there exist versions of $G_M, G, G_0$ with a.s. continuous paths (this holds for $G$ and $G_0$ by assumption, whereas the paths of $G_M$ are obtained from those of $G_2$, $G$ by a transformation that preserves continuity [see equation (3.4)]), and hence the condition on the limit process in the Lemma is fulfilled.

Hereby we have obtained a Gaussian process $G_M$ on the interval $[t_{00}, \infty]$ and have taken care of condition (iii) in the Lemma in Pollard (1984). For arbitrary positive $\varepsilon$ and $\delta$ we now have to find a $\sigma = \sigma(\delta, \varepsilon) > t_{00}$ such that

$$P\Big(\sup_{t_{00} \le t \le \sigma} |G_M(t)| > \delta\Big) < \varepsilon \tag{A.6}$$

$$\limsup_{n \to \infty} P\Big(\sup_{t_{00} < t \le \sigma} \sqrt{nh^d}\, \big|(M_{2,n}^- - M_2^-)(\sigma-|x) - (M_{2,n}^- - M_2^-)(t|x)\big| > \delta\Big) < \varepsilon. \tag{A.7}$$

Note that once we have found a $\sigma$ such that (A.7) holds, we can make $\sigma$ smaller until (A.6) is fulfilled with (A.7) still holding. This is possible because for every $\delta > 0$, we have $\lim_{\sigma \downarrow t_{00}} P\big(\sup_{t_{00} \le t \le \sigma} |G_M(t)| > \delta\big) = 0$, which can be established as follows. Define the function $K(t) := \int_t^{\infty} \frac{M_2^-(ds|x)}{H(s|x)}$ and denote by $W_t$ a Brownian motion on $[0, \infty]$. Then we have

$$\mathrm{Cov}\big(\sqrt{b(x)}\, W_{K(s)},\ \sqrt{b(x)}\, W_{K(t)}\big) = b(x)\big(K(s) \wedge K(t)\big) = b(x) \int_{s \vee t}^{\infty} \frac{M_2^-(ds|x)}{H(s|x)} = \mathrm{Cov}\big(G_M(s), G_M(t)\big),$$

where the last equality follows from Remark 3.7. Thus we have represented the process $G_M$ in terms of a Brownian motion and the assertion follows from the finiteness of $K(t_{00})$ [by assumption (D4)] and the properties of the Brownian motion.



In order to prove the existence of a constant $\sigma$ that ensures (A.7), we reverse time and transform our problem into the setting of conditional right censorship [see Section 3.3]. To be more precise, define the function $a(t) := 1/t$, which is strictly decreasing and maps the interval $[0,\infty]$ onto itself. Consider the random variables $B_i := a(S_i)$, $D_i := a(L_i)$, $Z_i := B_i \wedge D_i$ and $\Delta_i := I\{D_i \le B_i\} = I\{S_i \le L_i\}$. This is a conditional right censorship model with the useful property that $\Lambda_D^-(\cdot|x)$, the predictable hazard function of $D_i$, is closely connected to the reverse hazard function $M_2^-(\cdot|x)$ by the identity

$$\Lambda_D^-(a(t)|x) = M_2^-(\infty|x) - M_2^-(t-|x).$$

It is easy to verify that the conditional Nelson-Aalen estimator $\Lambda_{D,n}^-(dt|x)$ in the new model is related to the estimator $M_{2,n}^-$ in a similar way, i.e. $\Lambda_{D,n}^-(a(t)|x) = M_{2,n}^-(\infty|x) - M_{2,n}^-(t|x)$. Thus to prove (A.7) it suffices to find a $\sigma$ such that in the new model the following inequality is fulfilled

$$\text{(A.8)}\qquad \limsup_{n\to\infty} P\Big(\sup_{\sigma\le t\le t_0} \sqrt{nh^d}\,\big|(\Lambda_{D,n}^- - \Lambda_D^-)(t|x) - (\Lambda_{D,n}^- - \Lambda_D^-)(\sigma-|x)\big| > \delta\Big) < \varepsilon,$$

where we define $t_0 = a(t_{00}) < \infty$. This assertion is established in the proof of Theorem 3.11 [note that the assumptions (R2)-(R6) can be directly identified with the assumptions of Theorem 3.6]. □
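The time reversal can be checked numerically on simulated data. The sketch below only illustrates the two elementary facts used above: since $a(t) = 1/t$ is decreasing, $B_i \wedge D_i = a(S_i \vee L_i)$ and $I\{D_i \le B_i\} = I\{S_i \le L_i\}$. The distributions of `S` and `L` are arbitrary assumptions made for illustration and have nothing to do with the model assumptions of the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
S = rng.exponential(1.0, n) + 0.1    # hypothetical positive values playing the role of S_i
L = rng.exponential(0.5, n) + 0.1    # hypothetical positive values playing the role of L_i

a = lambda t: 1.0 / t                # time reversal map, strictly decreasing on (0, inf)
B, D = a(S), a(L)
Z = np.minimum(B, D)                 # Z_i = B_i ∧ D_i
Delta = (D <= B).astype(int)         # right-censoring indicator in the reversed model

# a minimum in reversed time is a maximum in original time
same_time = np.allclose(Z, a(np.maximum(S, L)))
# the censoring indicator is unchanged by the reversal
same_indicator = np.array_equal(Delta, (S <= L).astype(int))
```

In this reversed model the variable observed "from the left" in the original time scale becomes a right-censored variable, which is what allows results for conditional right censorship to be reused.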



Proof of Theorem 3.8. First of all note that the a.s. continuity of the sample paths of the processes $V(\cdot)$ and $W(\cdot)$ follows because these processes are constructed from processes which already have a.s. continuous sample paths in a way that preserves continuity. Thus it remains to verify the weak convergence. From Theorem 3.6 we obtain

$$\text{(A.9)}\qquad \sqrt{nh^d}\,\big(H_n - H,\ H_{0,n} - H_0,\ F_{L,n} - F_L\big) \Rightarrow (G, G_0, G_3)$$

in $D^3([t_{00},\infty])$. Now from $F_L(s-|x) - H(s-|x) = F_L(s-|x)(1 - F_S(s-|x))$ and the definition of $\tau$ it follows that

$$F_L(s-|x) - H(s-|x) \ge \varepsilon > 0 \qquad \forall s \in [t_{00},\tau]$$

[note that the inequality $F_L(t_{00}-|x) > 0$ was derived at the beginning of the proof of Theorem 3.6]. For positive numbers $\delta$ define the event

$$A_n(\delta) := \Big\{\inf_{t\in[t_{00},\tau]} \big(F_{L,n}(t|x) - H_n(t|x)\big) > \delta\Big\}.$$

Because of (A.9) [which implies the uniform consistency of $F_{L,n}(\cdot|x)$ and $H_n(\cdot|x)$], we have for $\delta < \varepsilon$ that $P(I_{A_n(\delta)} \neq 1) \to 0$. Define $\tilde H_n := H_n I_{A_n(\delta)}$, $\tilde H_{0,n} := H_{0,n} I_{A_n(\delta)}$ and $\tilde F_{L,n} := F_{L,n} I_{A_n(\delta)} + I_{A_n^c(\delta)}$, then it follows from (A.9)

$$\sqrt{nh^d}\,\big(\tilde F_{L,n} - F_L - (\tilde H_n - H),\ \tilde H_{0,n} - H_0\big) \Rightarrow (G_3 - G,\ G_0) \quad\text{in } D^2([t_{00},\tau]).$$

Moreover, the pair $(\tilde H_{0,n}, \tilde F_{L,n} - \tilde H_n)$ is an element of $\{(A,B) \in BV^2([t_{00},\tau]) : A \ge 0,\ B \ge \delta > 0\}$. Since the map $(A,B) \mapsto \int \frac{dA}{B}$ is Hadamard differentiable on this set [see Anderson et al. (1993), page 113], the delta method [see Gill (1989)] yields

$$\sqrt{nh^d}\,\Big(\int_{[t_{00},\cdot]} \frac{\tilde H_{0,n}(ds|x)}{\tilde F_{L,n}(s-|x) - \tilde H_n(s-|x)} - \Lambda_T^-(\cdot|x)\Big) \Rightarrow V(\cdot) \quad\text{in } D([t_{00},\tau]).$$

Finally, observe that for $t \ge t_{00}$ we have

$$\Lambda_{T,n}^-(t|x) = \int_{[t_{00},t]} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} + \int_{[0,t_{00})} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)}$$

and thus it remains to prove that the second term in this sum is of order $o_P(1/\sqrt{nh^d})$. From Lemma B.2 in the Appendix B we obtain the bound

$$\sup_{t_{00} \ge s \ge W_{(2)}(x,n)} \frac{1}{F_{L,n}(s-|x) - H_n(s-|x)} = O_P(1),$$

where $W_{(2)}(x,n)$ is defined in the proof of Theorem 3.5, and it follows

$$\int_{[W_{(2)}(x,n),\,t_{00})} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} \le H_{0,n}(t_{00}|x)\,O_P(1).$$

Standard arguments yield the estimate $H_{0,n}(t_{00}|x) = o_P(1/\sqrt{nh^d})$ and thus it remains to derive an estimate for the integral $\int_{[0,W_{(2)}(x,n))} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)}$. For this purpose we consider two cases. The first one appears if the $\delta_i$ corresponding to $W_{(1)}(x,n)$ equals 0. In this case there is positive mass at the point $W_{(1)}(x,n)$ but at the same time $F_{L,n}(s|x) = F_{L,n}(W_{(2)}(x,n)|x)$ for all $s \in [0,W_{(2)}(x,n))$ and hence $\int_{[0,W_{(2)}(x,n))} \frac{H_{0,n}(ds|x)}{F_{L,n}(s-|x) - H_n(s-|x)} = H_{0,n}(t_{00}|x)\,O_P(1)$. For all other values of the corresponding $\delta_i$ the mass of $H_{0,n}(ds|x)$ at the point $W_{(1)}(x,n)$ equals zero and thus the integral vanishes. Now the proof of the theorem is complete. □
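The delta-method step in this proof can be made more concrete by differentiating the map formally. What follows is only a sketch, not the rigorous Hadamard-differentiability argument cited above. It assumes the notation $\phi(A,B)(t) := \int_{[t_{00},t]} dA/B$ and a perturbation direction $(\alpha,\beta)$; the symbols $\phi$, $\alpha$, $\beta$ and the small parameter $\epsilon$ are introduced here for illustration and do not appear in the original proof.

```latex
\phi(A+\epsilon\alpha,\,B+\epsilon\beta)(t) - \phi(A,B)(t)
  = \int_{[t_{00},t]} \frac{d(A+\epsilon\alpha)}{B+\epsilon\beta} - \int_{[t_{00},t]} \frac{dA}{B}
  = \epsilon\left( \int_{[t_{00},t]} \frac{d\alpha}{B} - \int_{[t_{00},t]} \frac{\beta}{B^{2}}\,dA \right) + o(\epsilon),
\qquad\text{so that formally}\qquad
d\phi_{(A,B)}(\alpha,\beta)(t) = \int_{[t_{00},t]} \frac{d\alpha}{B} - \int_{[t_{00},t]} \frac{\beta}{B^{2}}\,dA .
```

The expansion uses $1/(B+\epsilon\beta) = 1/B - \epsilon\beta/B^2 + O(\epsilon^2)$, which is where the lower bound $B \ge \delta > 0$ enters. When the direction $\alpha$ is a process that is not of bounded variation, the first integral has to be interpreted via integration by parts.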



Proof of Theorem 3.9. Note that the estimator $F_{T,n}^{IP}(\cdot|x)$ is nondecreasing by construction. The assertion for $\hat q^{IP}(\cdot|x)$ now follows from the Hadamard differentiability of the inversion mapping tangentially to the space of continuous functions [see Proposition 1 in Gill (1989)], the continuity of $F_T(\cdot|x)$ and the weak uniform consistency of $F_{T,n}^{IP}(\cdot|x)$ on the interval $[0,\tau]$. The corresponding result for the estimator $\hat q(\cdot|x)$ follows from the convergence $P\big(\hat q^{IP}(\cdot|x) \equiv \hat q(\cdot|x)\big) \to 1$ [see the discussion after Lemma 3.4]. □
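The role of monotonicity in the inversion step can be illustrated with a small sketch. It assumes that a nondecreasing estimate of the conditional distribution function has already been evaluated on a grid; `generalized_inverse`, `grid` and `cdf_values` are hypothetical names and toy data introduced here, not objects from the paper.

```python
import numpy as np

def generalized_inverse(grid, cdf_values, p):
    """Smallest grid point t with cdf_values(t) >= p, for nondecreasing cdf_values.

    If p exceeds every value in cdf_values, the last grid point is returned.
    """
    idx = np.searchsorted(cdf_values, p, side="left")
    idx = np.minimum(idx, len(grid) - 1)
    return grid[idx]

grid = np.array([1.0, 2.0, 3.0, 4.0])
cdf_values = np.array([0.1, 0.4, 0.4, 0.9])   # toy nondecreasing step function
levels = np.array([0.05, 0.4, 0.5])
q = generalized_inverse(grid, cdf_values, levels)
```

Because the input is nondecreasing, the resulting quantiles are nondecreasing in the level `p`, so quantile curves obtained by inverting such an estimate cannot cross.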



Proof of Theorem 3.10. Observe that the estimator $F_{T,n}^{IP}(\cdot|x)$ is nondecreasing by construction and that Theorem 3.8 yields $\sqrt{nh^d}\,\big(F_{T,n}^{IP}(\cdot|x) - F_T(\cdot|x)\big) \Rightarrow W(\cdot)$ on $D([0,\tau+\alpha])$ for some $\alpha > 0$, where the process $W$ has a.s. continuous sample paths. Note that the convergence holds on $D([0,\tau+\alpha])$. This follows from the continuity of $F_S(\cdot|x)$ and $F_T^{-1}(\cdot|x)$ at $\tau$, which implies $F_S(F_T^{-1}(\tau+\alpha|x)|x) < 1$ for some $\alpha > 0$. By the same arguments $f_T(\cdot|x) \ge \delta > 0$ on the interval $[\varepsilon-\alpha,\tau+\alpha]$ if we choose $\alpha$ sufficiently small. Thus Proposition 1 from Gill (1989) together with the delta method yield the weak convergence of the process for $\hat q^{IP}(\cdot|x)$. The corresponding result for $\hat q(\cdot|x)$ follows from the fact that $P\big(\hat q^{IP}(\cdot|x) \equiv \hat q(\cdot|x)\big) \to 1$. □
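For orientation, the form of limit that this delta-method argument suggests can be written out. This is a formal sketch under the assumption that the inversion map $\Phi(F) := F^{-1}$ is differentiated at a distribution function $F$ with positive density $f$ in a continuous direction $w$; the symbols $\Phi$ and $w$ are introduced here, and the precise statement and domain are those of Theorem 3.10 itself, which is not repeated in this appendix.

```latex
d\Phi_{F}(w)(p) = -\,\frac{w\big(F^{-1}(p)\big)}{f\big(F^{-1}(p)\big)},
\qquad\text{hence formally}\qquad
\sqrt{nh^{d}}\,\big(\hat q^{IP}(p|x) - F_T^{-1}(p|x)\big)
  \;\Rightarrow\; -\,\frac{W\big(F_T^{-1}(p|x)\big)}{f_T\big(F_T^{-1}(p|x)\,\big|\,x\big)} .
```

The density in the denominator is why the proof needs the lower bound $f_T(\cdot|x) \ge \delta > 0$ on a slightly enlarged interval.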



Proof of Theorem 3.11. By the delta method [Gill (1989)], formula (3.6), and the Hadamard differentiability of the product-limit mapping [Anderson et al. (1993)] it suffices to verify the weak convergence of $\sqrt{nh^d}\,\big(\Lambda_{D,n}^-(t|x) - \Lambda_D^-(t|x)\big)_t$ on $D([0,t_0])$. The corresponding result on $D([0,\tau])$ with $\tau < t_0$ follows from the delta method and the Hadamard differentiability of the mapping $(H_{0,n}, F_{Z,n}) \mapsto \Lambda_{D,n}^-$. For the extension of the convergence to $D([0,t_0])$ it suffices to establish condition (A.8) [this follows by arguments similar to those in the proof of Theorem 3.6]. Define the random variable $U$ as the largest $Z_i$ corresponding to a nonvanishing weight $W_i(x)$, i.e.

$$U = U(x) := \max\{Z_i : W_i(x) \neq 0\}.$$

Note that for $t \ge U$ we have $F_{Z,n}(t|x) = 1$ for the corresponding estimate of $F_Z(\cdot|x)$. We write

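$$\Lambda_{D,n}^-(y|x) = \sum_{i=1}^n \int_{[0,y]} \frac{d\big(W_i(x) I\{Z_i \le t, \Delta_i = 1\}\big)}{\sum_{j=1}^n W_j(x) I\{Z_j \ge t\}} = \sum_{i=1}^n \int_{[0,y]} \frac{W_i(x) I\{Z_i \ge t\}\, d\big(I\{Z_i \le t, \Delta_i = 1\}\big)}{\sum_{j=1}^n W_j(x) I\{Z_j \ge t\}} = \sum_{i=1}^n \int_{[0,y]} C_i(x,t)\, I\{1 - F_{Z,n}(t-|x) > 0\}\, dN_i(t)$$

for the plug-in estimator of $\Lambda_D^-(y|x)$, where

$$C_i(x,t) := \frac{W_i(x) I\{Z_i \ge t\}}{\sum_{j=1}^n W_j(x) I\{Z_j \ge t\}} = \frac{V_i(x) I\{Z_i \ge t\}}{\sum_{j=1}^n V_j(x) I\{Z_j \ge t\}}$$

and the quantity $N_i(t)$ is defined as $N_i(t) := I\{Z_i \le t, \Delta_i = 1\}$. In what follows, we will use the notation $G(A) = \int_A G(du)$ for a distribution function $G$ and a Borel set $A$. With the definition

$$\tilde\Lambda_{D,n}^-(y|x) := \sum_{i=1}^n \int_{[0,y]} C_i(x,t)\, I\{1 - F_{Z,n}(t-|x) > 0\}\, \Lambda_D^-(dt|X_i)$$

we obtain the decomposition

$$\big|(\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,t]|x)\big| \le \big|(\Lambda_{D,n}^- - \tilde\Lambda_{D,n}^-)((\sigma,U\wedge t]|x)\big| + \big|(\Lambda_{D,n}^- - \tilde\Lambda_{D,n}^-)((U\wedge t,t]|x)\big| + \big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,t]|x)\big|.$$

Observing that $\Lambda_{D,n}^-((U\wedge t,t]|x) = \tilde\Lambda_{D,n}^-((U\wedge t,t]|x) = 0$ it follows that

$$\big|(\Lambda_{D,n}^- - \tilde\Lambda_{D,n}^-)((U\wedge t,t]|x)\big| = 0,$$

$$\big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,t]|x)\big| \le \big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,U\wedge t]|x)\big| + \Lambda_D^-((U\wedge t,t]|x),$$

$$\sup_{\sigma\le t\le t_0} \big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,t\wedge U]|x)\big| \le \sup_{\sigma\le t\le U\wedge t_0} \big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,t]|x)\big|,$$

where we set the supremum over the empty set to zero. Hence assertion (A.8) can be obtained from the statements

$$\text{(A.10)}\qquad \sqrt{nh^d} \sup_{\sigma\le t\le t_0} \Lambda_D^-((U\wedge t,t]|x) \stackrel{P}{\longrightarrow} 0,$$

$$\text{(A.11)}\qquad \sqrt{nh^d} \sup_{\sigma\le t\le U\wedge t_0} \big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,t]|x)\big| \stackrel{P}{\longrightarrow} 0,$$

$$\text{(A.12)}\qquad \limsup_{n\to\infty} P\Big(\sqrt{nh^d} \sup_{\sigma\le t\le t_0} \big|(\Lambda_{D,n}^- - \tilde\Lambda_{D,n}^-)((\sigma,U\wedge t]|x)\big| > \delta\Big) < \varepsilon/2,$$

which will be shown separately.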
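The plug-in estimator of the cumulative hazard that is decomposed in this proof is a weighted Nelson-Aalen sum, and a direct implementation may help to fix ideas. The sketch below assumes nonnegative weights that sum to one and at-risk sets of the form $\{Z_j \ge Z_i\}$; `weighted_nelson_aalen` and the toy data are hypothetical and only illustrate the structure of the sum.

```python
import numpy as np

def weighted_nelson_aalen(z, delta, w, y):
    """Sum over uncensored Z_i <= y of W_i / sum_j W_j 1{Z_j >= Z_i}."""
    z, delta, w = map(np.asarray, (z, delta, w))
    total = 0.0
    for i in range(len(z)):
        # only uncensored observations with positive weight contribute a jump
        if delta[i] == 1 and z[i] <= y and w[i] != 0:
            at_risk = w[z >= z[i]].sum()      # weighted size of the risk set at Z_i
            total += w[i] / at_risk
    return total

z = [1.0, 2.0, 3.0]
delta = [1, 0, 1]
w = [1 / 3, 1 / 3, 1 / 3]                     # equal weights reduce to the classical estimator
value_at_2 = weighted_nelson_aalen(z, delta, w, 2.0)
value_at_3 = weighted_nelson_aalen(z, delta, w, 3.0)
```

With equal weights the jumps are $1/3$ at the first observation and $1$ at the last one. The last jump has the largest possible size, which is the reason the behaviour near the largest observation $U$ has to be treated separately in the proof.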



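Proof of (A.10). For a proof of (A.10) note that

$$\Lambda_D^-((U\wedge t,t]|x) = \begin{cases} 0, & U \ge t,\\ \Lambda_D^-((U,t]|x), & U < t,\end{cases}$$

and $\Lambda_D^-((U,t]|x) \le \Lambda_D^-((U\wedge t_0,t_0]|x)$ whenever $U < t \le t_0$. Hence, the supremum in (A.10) can be bounded by

$$\text{(A)}\qquad \sup_{\sigma\le t\le t_0} \Lambda_D^-((U\wedge t,t]|x) \le \Lambda_D^-((U\wedge t_0,t_0]|x).$$

Observing (R2) we have $F_D([t_0,\infty]|x) > 0$ [note that $\Lambda_D^-(dt|x) = \frac{F_D(dt|x)}{1 - F_D(t-|x)}$] and obtain

$$\text{(B)}\qquad \Lambda_D^-((U\wedge t_0,t_0]|x) \le \int_{(U\wedge t_0,t_0]} \frac{F_D(dt|x)}{F_D([t_0,\infty]|x)} = \frac{F_D((U\wedge t_0,t_0]|x)}{F_D([t_0,\infty]|x)}.$$

Observing (A) and (B) it suffices to verify the convergence $\sqrt{nh^d}\,F_D((U\wedge t_0,t_0]|x) \stackrel{P}{\longrightarrow} 0$. For this purpose we introduce the notation

$$u_n^\alpha = u_n^\alpha(x) := \inf\big\{s : \sqrt{nh^d}\,F_D((s,t_0]|x) \le \alpha\big\}$$

[note that $u_n^\alpha \le t_0$]. Then we obtain for any fixed $\alpha > 0$ and sufficiently large $n$

$$\begin{aligned}
P\big(\sqrt{nh^d}\,F_D((U\wedge t_0,t_0]|x) > \alpha\big) &\le E\big[I\{U\wedge t_0 < u_n^\alpha\}\big] = E\big[E[I\{U\wedge t_0 < u_n^\alpha\}\,|\,X_1,\dots,X_n]\big]\\
&\le E\Big[\prod_{i=1}^n E\big[1 - I\{Z_i \ge u_n^\alpha\} I\{W_i(x) \neq 0\}\,\big|\,X_1,\dots,X_n\big]\Big]\\
&\le E\Big[\prod_{i=1}^n \big\{1 - E[I\{Z_i \ge u_n^\alpha\}|X_i]\, I\{\|X_i - x\| \le c_n\}\big\}\Big]\\
&= E\Big[\prod_{i=1}^n \big\{1 - F_Z([u_n^\alpha,\infty]|X_i)\, I\{\|X_i - x\| \le c_n\}\big\}\Big]\\
&\le E\Big[\prod_{i=1}^n \big\{1 - F_Z([u_n^\alpha,\infty]|X_i)\, I\{X_i \in U_{c_n}(x)\cap I\}\big\}\Big]\\
&\stackrel{(*)}{\le} E\Big[\prod_{i=1}^n \big\{1 - C\,F_Z([u_n^\alpha,\infty]|x)\, I\{X_i \in U_{c_n}(x)\cap I\}\big\}\Big]\\
&= \prod_{i=1}^n \big\{1 - C\,F_D([u_n^\alpha,\infty]|x)\,F_B([u_n^\alpha,\infty]|x)\,E[I\{X_i \in U_{c_n}(x)\cap I\}]\big\}\\
&\le \prod_{i=1}^n \big\{1 - C\,F_D([u_n^\alpha,t_0)|x)\,F_B([u_n^\alpha,\infty]|x)\,E[I\{X_i \in U_{c_n}(x)\cap I\}]\big\}\\
&\stackrel{(**)}{\le} \prod_{i=1}^n \big\{1 - C\,c\,c_n^d\,F_D([u_n^\alpha,t_0)|x)\,F_B([u_n^\alpha,\infty]|x)\big\}\\
&\le \prod_{i=1}^n \Big\{1 - \frac{C\,c\,O(1)\,\alpha^2}{n}\,\frac{F_B([u_n^\alpha,\infty]|x)}{F_D([u_n^\alpha,t_0)|x)}\Big\},
\end{aligned}$$

where the inequalities (*), (**) follow from (R5), the last inequality follows from the definition of $u_n^\alpha$ and the $O(1)$ does not depend on $i$ [it comes from the ratio $c_n/h$]. Now we have

$$\frac{F_D([u_n^\alpha,t_0)|x)}{F_B([u_n^\alpha,\infty]|x)} \le \int_{[u_n^\alpha,t_0)} \frac{F_D(ds|x)}{F_B((s,\infty]|x)} \le \int_{[u_n^\alpha,t_0)} \frac{F_D(ds|x)}{F_B((s,\infty]|x)\,F_D((s,\infty]|x)\,F_D([s,\infty]|x)} = \int_{[u_n^\alpha,t_0)} \frac{\Lambda_D^-(ds|x)}{F_Z((s,\infty]|x)} \longrightarrow 0$$

by (R2) [note that $u_n^\alpha \to t_0$ if $n \to \infty$] and hence the proof of (A.10) is complete.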
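The last step of the proof of (A.10) uses an elementary bound that is worth spelling out. The sketch below introduces the shorthand $K_n$ for the quantity that multiplies $1/n$ inside the final product; this shorthand is not part of the original text, and the argument assumes that the $O(1)$ factor in $K_n$ stays bounded away from zero.

```latex
\prod_{i=1}^{n}\Big(1-\frac{K_n}{n}\Big) = \Big(1-\frac{K_n}{n}\Big)^{n} \le \exp(-K_n) \longrightarrow 0
\quad\text{whenever } 0\le K_n\le n \text{ and } K_n\to\infty,
\qquad\text{since } 1-y\le e^{-y} \text{ for all } y\in\mathbb{R}.
```

Here $K_n$ is proportional to $\alpha^2 F_B([u_n^\alpha,\infty]|x)/F_D([u_n^\alpha,t_0)|x)$, and the final display of the proof shows that the reciprocal of this ratio tends to zero, so $K_n \to \infty$ and the probability in question vanishes for every fixed $\alpha > 0$.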



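Proof of (A.11). For fixed $\sigma \le s \le U\wedge t_0$ and sufficiently small $h$ we have

$$\begin{aligned}
\big|(\tilde\Lambda_{D,n}^- - \Lambda_D^-)((\sigma,s]|x)\big| &= \Big|\int_\sigma^s \sum_{i=1}^n C_i(x,t)\big(\lambda_D(t|X_i) - \lambda_D(t|x)\big)\,dt\Big|\\
&= \Big|\int_\sigma^s \sum_{i=1}^n C_i(x,t)\Big((x - X_i)'\partial_x\lambda_D(t|x) + \tfrac12 (x - X_i)'\partial_x^2\lambda_D(t|\xi_i)(x - X_i)\Big)\,dt\Big|\\
&\le \Big|\int_\sigma^s \sum_{i=1}^n C_i(x,t)(x - X_i)'\partial_x\lambda_D(t|x)\,dt\Big| + \frac{C}{2}\int_\sigma^s \sum_{i=1}^n C_i(x,t)\,\|x - X_i\|^2\,dt,
\end{aligned}$$

with some positive constant $C$, where we used (R4) in the last inequality. The second term in the above inequality can be bounded as follows

$$\frac{C}{2}\int_\sigma^s \sum_{i=1}^n C_i(x,t)\,\|x - X_i\|^2\,dt \le \frac{C}{2}\int_\sigma^s \sum_{i=1}^n C_i(x,t)\,O(h^2)\,dt \le \frac{C}{2}(t_0 - \sigma)\,O(h^2) = O(h^2) = o\Big(\frac{1}{\sqrt{nh^d}}\Big),$$

where the last inequality holds uniformly in $s \in [\sigma,t_0]$. Thus it remains to consider the first term, which can be represented as follows

$$R_n := \int_\sigma^s \frac{\sum_{i=1}^n V_i(x) I\{Z_i \ge t\}(x - X_i)'}{\sum_{j=1}^n V_j(x) I\{Z_j \ge t\}}\,\partial_x\lambda_D(t|x)\,dt = \int_\sigma^s \frac{\sum_{i=1}^n V_i(x) I\{Z_i \ge t\}(x - X_i)'}{\sum_{k=1}^n V_k(x)}\cdot\frac{1 - F_Z(t-|x)}{1 - F_{Z,n}(t-|x)}\cdot\frac{\partial_x\lambda_D(t|x)}{1 - F_Z(t-|x)}\,dt.$$

Now, from conditions (W1)(3) and (W1)(4), which give

$$\frac{\sum_{i=1}^n V_i(x) I\{Z_i \ge t\}(x - X_i)}{\sum_{k=1}^n V_k(x)} = o_P\big(1/\sqrt{nh^d}\big) \quad\text{uniformly in } t \in (\sigma,U\wedge t_0),$$

from (R3) and from

$$\frac{1 - F_Z(t-|x)}{1 - F_{Z,n}(t-|x)} = O_P(1) \quad\text{uniformly in } t \in (\sigma,U\wedge t_0)$$

[see Lemma B.3 in the Appendix B] we obtain

$$|R_n| = o_P\big(1/\sqrt{nh^d}\big)\int_\sigma^s \frac{|\partial_x\lambda_D(t|x)|}{1 - F_Z(t-|x)}\,dt \le o_P\big(1/\sqrt{nh^d}\big)\int_\sigma^{t_0} \frac{|\partial_x\lambda_D(t|x)|}{1 - F_Z(t-|x)}\,dt = o_P\big(1/\sqrt{nh^d}\big)$$

uniformly in $s \in [\sigma,t_0]$, and hence assertion (A.11) is established.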



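Proof of (A.12). Observe that $\big|(\Lambda_{D,n}^- - \tilde\Lambda_{D,n}^-)((\sigma,U\wedge t_0]|x)\big| \le |D_1(U\wedge t_0) - D_1(\sigma)|$, where we have used the notation $M_i(t) := N_i(t) - \int_0^t I\{Z_i \ge s\}\,\Lambda_D^-(ds|X_i)$ and

$$\text{(A.13)}\qquad D_1(t) := \sum_{i=1}^n \int_{[0,t]} C_i(x,s)\, I\{1 - F_{Z,n}(s-|x) > 0\}\, dM_i(s).$$

Define $\mathcal{F}_t := \sigma\big(X_i, I\{Z_i \le t, \Delta_i = 1\}, I\{Z_i \le t, \Delta_i = 0\} : i = 1,\dots,n\big)$ and note that the $M_i$ are independent locally bounded martingales with respect to $(\mathcal{F}_t)_t$ [see Theorem 2.3.2 p. 61 in Fleming and Harrington (1991)]. Moreover, $I\{1 - F_{Z,n}(t-|x) > 0\}$, $I\{Z_i \ge t\}$ and $V_i(x)$ [and with them $C_i(x,t)$] are measurable with respect to $\mathcal{F}_t$ and left-continuous, hence predictable. The structure of the 'weights' $C_i$ also implies their boundedness.

Thus for $t \le t_0$, $D_1(t)$ is a locally bounded right continuous martingale with predictable variation given by

$$\text{(A.14)}\qquad \langle D_1, D_1\rangle(t) = \int_{[0,t]} \sum_{i=1}^n C_i^2(x,s)\, I\{1 - F_{Z,n}(s-|x) > 0\}\,\Lambda_D^-(ds|X_i).$$

Note that with $D_1$, $D_1(t) - D_1(\sigma)$ is also a locally bounded martingale for $t \in [\sigma,t_0]$ with predictable variation $\langle D_1, D_1\rangle(t) - \langle D_1, D_1\rangle(\sigma)$. Hence from a version of Lenglart's inequality [see Shorack and Wellner (1986), p. 893, Example 1] we obtain

$$\text{(A.15)}\qquad P\Big(\sup_{\sigma\le t\le U\wedge t_0} nh^d\,\big(D_1(t) - D_1(\sigma)\big)^2 \ge \varepsilon\Big) \le \frac{\eta}{\varepsilon} + P(V_n \ge \eta),$$

where $V_n = nh^d\big(\langle D_1, D_1\rangle(U\wedge t_0) - \langle D_1, D_1\rangle(\sigma)\big)$. If $\sigma$ is sufficiently close to $t_0$ it follows

$$\begin{aligned}
V_n &= nh^d \int_{[\sigma,U\wedge t_0]} \sum_{i=1}^n C_i^2(x,t)\,\Lambda_D^-(dt|X_i)\\
&\le nh^d \sup_i V_i(x) \int_{[\sigma,U\wedge t_0]} \sum_{i=1}^n \frac{C_i(x,t)}{(1 - F_{Z,n}(t-|x))\sum_j V_j(x)}\,\Lambda_D^-(dt|X_i)\\
&\stackrel{(*)}{=} O_P(1)\int_{[\sigma,U\wedge t_0]} \sum_{i=1}^n \frac{C_i(x,t)}{1 - F_{Z,n}(t-|x)}\,\lambda_D(t|x)\,dt\,(1 + o_P(1))\\
&= O_P(1)\int_{[\sigma,U\wedge t_0]} \frac{\lambda_D(t|x)}{1 - F_{Z,n}(t-|x)}\,dt\,(1 + o_P(1))\\
&= O_P(1)\int_{[\sigma,U\wedge t_0]} \frac{\lambda_D(t|x)}{1 - F_Z(t-|x)}\cdot\frac{1 - F_Z(t-|x)}{1 - F_{Z,n}(t-|x)}\,dt\,(1 + o_P(1))\\
&= O_P(1)\int_{[\sigma,U\wedge t_0]} \frac{\lambda_D(t|x)}{1 - F_Z(t-|x)}\,dt,
\end{aligned}$$

where we have used (R6), (W1)(1) and (W1)(3) in equality (*) [note that the $(1 + o_P(1))$ holds uniformly in $i$ and $t$] and Lemma B.3 in the last equality. Now we obtain from (R2) the a.s. convergence $\int_{[\sigma,U\wedge t_0]} \frac{\lambda_D(t|x)}{1 - F_Z(t-|x)}\,dt \to 0$ as $\sigma \to t_0$ and hence assertion (A.12) is established [first choose $\eta$ in (A.15) small enough to make $\eta/\varepsilon$ small and then choose $\sigma$ close enough to $t_0$].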



Summarizing these considerations, we have established (A.10)-(A.12) and the proof of the theorem is complete. □

B Auxiliary results: technical details

Lemma B.1 Let $M$ be a locally bounded, right-continuous martingale on $[0,\infty)$ and denote by $\langle M, M\rangle$ the predictable variation of $M$. Then we have for any stopping time $U$ with $P(U < \infty) = 1$ and all $\eta, \varepsilon > 0$

$$P\Big(\sup_{t\le U} M^2(t) \ge \varepsilon\Big) \le \frac{\eta}{\varepsilon} + P\big(\langle M, M\rangle(U) \ge \eta\big).$$

Proof: In fact this Lemma is a specific version of Lenglart's inequality [see Fleming and Harrington (1991), Theorem 3.4.1]. To be precise note that it suffices to prove that for any a.s. finite stopping time $T$

$$\text{(B.1)}\qquad E[M^2(T)] \le E[\langle M, M\rangle(T)].$$

Let $\tau_k$ denote a localizing sequence such that $|M(\cdot\wedge\tau_k)| \le k$ and $M^2(t\wedge\tau_k) - \langle M, M\rangle(t\wedge\tau_k)$ is a martingale. Define the processes

$$X_k(t) := M^2(t\wedge\tau_k), \qquad Y_k(t) := \langle M, M\rangle(t\wedge\tau_k).$$

Note that by Theorem 2.2.2 in Fleming and Harrington (1991) $(X_k - Y_k)(t\wedge T)$ is a martingale and hence for all $t$:

$$\text{(B.2)}\qquad E[X_k(t\wedge T)] = E[Y_k(t\wedge T)].$$

Moreover, $k^2 \ge X_k(t\wedge T) \to X_k(T)$ a.s. as $t \to \infty$, and hence we obtain by the Dominated Convergence Theorem

$$E[X_k(T)] = \lim_{t\to\infty} E[X_k(t\wedge T)].$$

Since the process $\langle M, M\rangle$ is increasing, we also have

$$Y_k(t\wedge T) \uparrow Y_k(T) \quad\text{a.s.}$$

and by the Monotone Convergence Theorem

$$E[Y_k(T)] = \lim_{t\to\infty} E[Y_k(t\wedge T)].$$

Combining this and (B.2) we obtain the identity $E[X_k(T)] = E[Y_k(T)]$ for all a.s. finite stopping times $T$. Hence we can apply Lenglart's inequality to the process $X_k$ dominated by $Y_k$, which leads to:

$$P_{1,k} := P\Big(\sup_{t\le U} M^2(t\wedge\tau_k) \ge \varepsilon\Big) \le \frac{\eta}{\varepsilon} + P\big(\langle M, M\rangle(U\wedge\tau_k) \ge \eta\big) =: \frac{\eta}{\varepsilon} + P_{2,k}.$$

Finally, from $\sup_{t\le U} M^2(t\wedge\tau_k) = \sup_{t\le U\wedge\tau_k} M^2(t) \uparrow \sup_{t\le U} M^2(t)$ and $\langle M, M\rangle(U\wedge\tau_k) \uparrow \langle M, M\rangle(U)$ a.s. as $k$ tends to infinity we obtain the desired result. □
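The inequality of Lemma B.1 can be illustrated by simulation in a discrete-time analogue. The sketch below is an assumption-laden toy example, not part of the proof: $M$ is a simple symmetric $\pm 1$ random walk, for which the predictable variation at time $k$ equals $k$, and $U$ is the first time the walk reaches a lower barrier, capped at `n_steps`. All names and parameter values are hypothetical.

```python
import numpy as np

def lenglart_check(n_paths=20000, n_steps=50, eps=100.0, eta=30.0, barrier=-4, seed=0):
    """Empirical left and right hand sides of the Lenglart-type bound for a +-1 random walk."""
    rng = np.random.default_rng(seed)
    steps = rng.choice([-1, 1], size=(n_paths, n_steps))
    paths = np.cumsum(steps, axis=1)               # M(1), ..., M(n_steps); M(0) = 0
    hit = paths <= barrier
    # U = first time the walk reaches the barrier, or n_steps if it never does
    U = np.where(hit.any(axis=1), hit.argmax(axis=1) + 1, n_steps)
    times = np.arange(1, n_steps + 1)
    # squared martingale, set to zero after the stopping time U
    stopped_sq = np.where(times[None, :] <= U[:, None], paths**2, 0)
    lhs = np.mean(stopped_sq.max(axis=1) >= eps)   # P(sup_{t <= U} M^2(t) >= eps)
    rhs = eta / eps + np.mean(U >= eta)            # eta/eps + P(<M, M>(U) >= eta), <M, M>(U) = U
    return lhs, rhs

lhs, rhs = lenglart_check()
```

The bound is far from sharp in this example, which is typical: its value in the proofs above lies in controlling the supremum of the martingale through its predictable variation alone.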
Lemma B.2 Assume that conditions ?? and (D11) hold. Denote by $W_1(x,n),\dots,W_k(x,n)$ those values of $Y_1,\dots,Y_n$ whose weights fulfill $W_i(x) \neq 0$ and by $W_{(1)}(x,n),\dots,W_{(k)}(x,n)$ the corresponding increasingly ordered values. Assume that the estimators $F_{L,n}$ and $H_n$ are based on weights $W_i(x) = V_i(x)/\sum_j V_j(x)$ with $V_i(x)$ satisfying the conditions (W1)(1), (W1)(2), that $F_{S,n}(\tau|x) := H_n(\tau|x)/F_{L,n}(\tau|x)$ is consistent for some $\tau > t_{00}$ with $F_S(\tau|x) < 1$ and that all the observations $Y_i$ are distinct. Then we have for any $b < \tau$:

$$\sup_{b \ge s \ge W_{(2)}(x,n)} \frac{1}{F_{L,n}(s-|x) - H_n(s-|x)} = O_P(1).$$

Proof: As in the proof of Theorem 3.6 we reverse the time and use the same notation. Write $V_x := a(W_{(2)}(x,n))$, $v = a(\tau)$, $w = a(b)$, then the statement of the Lemma can be reformulated as

$$\sup_{w\le s\le V_x} \frac{1}{1 - F_{D,n}(s|x) - (1 - F_{Z,n}(s|x))} = O_P(1).$$

With the notation $F_{B,n}(s|x) := 1 - (1 - F_{Z,n}(s|x))/(1 - F_{D,n}(s|x))$ the denominator in this expression can be rewritten as

$$1 - F_{D,n}(s|x) - (1 - F_{Z,n}(s|x)) = (1 - F_{D,n}(s|x))\,F_{B,n}(s|x)$$

[note that $F_{B,n}(v|x) = 1 - F_{S,n}(\tau-|x)$]. Since $F_{B,n}(s|x)$ is increasing in $s$ and consistent at some point $v < w$ with $F_B(v|x) > 0$, we only need to worry about finding a bound in probability for the term $1/(1 - F_{D,n}(s|x))$. Such a bound can be derived by exploiting the underlying martingale structure of the estimator $\Lambda_{D,n}^-(t)$ of the hazard measure. More precisely, using exactly the same arguments as given in the proof of Theorem 3.6 and the same notation we obtain $\Lambda_{D,n}^-(t\wedge V_x|x) - \tilde\Lambda_{D,n}^-(t\wedge V_x|x) = D_1(t\wedge V_x)$, where $D_1(t)$ is defined in (A.13) and is a locally bounded right continuous martingale on $[0,\infty)$ with predictable variation given in (A.14). The martingale property of $D_1(t)$ implies that $|D_1(t)|$ is a nonnegative submartingale and from Doob's submartingale inequality we obtain for any $\beta > 0$ and sufficiently large $n$

$$P\Big(\sup_{t\le V_x} |D_1(t)| \ge \frac{1}{\beta}\Big) \le \beta\,E|D_1(V_x)| \le \beta\sqrt{E\,D_1^2(V_x)} \le \beta\sqrt{E\,\langle D_1, D_1\rangle(V_x)} \le \beta\sqrt{\sup_{y\in U_\varepsilon(x)} \Lambda_D^-(V_x|y)},$$

where we have used the inequality (B.1) from the proof of Lemma B.1 and the fact that the



weights $C_i$ are positive and sum up to one. Note that the expression $\sup_{y\in U_\varepsilon(x)} \Lambda_D^-(V_x|y)$ is finite. This follows from condition (D11), restated for the time-reversed model, and the relation $\Lambda_D^-(t|x) = -\log(1 - F_D(t|x))$. Thus we have obtained the estimate $\sup_{t\le V_x} |D_1(t)| = O_P(1)$.

From the definition of $\tilde\Lambda_{D,n}^-(t|x)$ we can derive the bound $\sup_{t\le V_x} \tilde\Lambda_{D,n}^-(t|x) \le \sup_{y\in U_\varepsilon(x)} \Lambda_D^-(V_x|y)$, and thus obtain

$$\text{(B.3)}\qquad \sup_{t\le V_x} \Lambda_{D,n}^-(t|x) \le \sup_{t\le V_x} |D_1(t)| + \sup_{t\le V_x} \tilde\Lambda_{D,n}^-(t|x) = O_P(1).$$

Finally, we note that the estimator $F_{D,n}(s|x)$ can be expressed in terms of the statistic $\Lambda_{D,n}^-(t|x)$ by using the product limit map as $1 - F_{D,n}(t|x) = \prod_{[0,t]} \big(1 - \Lambda_{D,n}^-(ds|x)\big)$. By exactly the same arguments as given in the proof of Lemma 6 in Gill and Johansen (1990) we obtain the inequality

$$1 - F_{D,n}(t|x) \ge \exp\big(-c(\eta)\,\Lambda_{D,n}^-(t|x)\big) \quad\text{a.s.}$$

whenever $0 \le t \le V_x$, where $1 - \eta$ is the size of the largest atom of $\Lambda_{D,n}^-$ on the interval $(0,V_x]$ and $c(\eta) := -\log(\eta)/(1 - \eta) < \infty$ [note that, whenever all observations take distinct values, the size of the largest atom of $\Lambda_{D,n}^-$ on $(0,V_x]$ is less or equal to the largest possible value of $\sum_i W_i(x) I\{Z_i = V_x, \Delta_i = 1\} / \sum_i W_i(x) I\{Z_i \ge V_x\}$, which can in turn be bounded by $\bar c/(\underline c + \bar c) < 1$ uniformly in $n$, and thus $\eta > 0$]. The desired bound for $1/(1 - F_{D,n}(s|x))$ now follows from the above inequality together with (B.3) and thus the proof is complete. □
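The lower bound on the product-limit expression used at the end of this proof follows from an elementary inequality: if every atom $a$ of the cumulative hazard satisfies $a \le 1 - \eta$, then $1 - a \ge \exp(-c(\eta)\,a)$ with $c(\eta) = -\log(\eta)/(1-\eta)$, because $a \mapsto -\log(1-a)/a$ is increasing on $(0,1)$. The following sketch checks the resulting bound for a purely discrete hazard with randomly generated atoms; the atoms and the value of `eta` are hypothetical.

```python
import numpy as np

def c_of_eta(eta):
    """Constant c(eta) = -log(eta) / (1 - eta) from the product-limit lower bound."""
    return -np.log(eta) / (1.0 - eta)

rng = np.random.default_rng(2)
eta = 0.2
# hypothetical hazard increments, each of size at most 1 - eta
atoms = rng.uniform(0.0, 1.0 - eta, size=50)

survival = np.prod(1.0 - atoms)                        # product-limit value, i.e. 1 - F
cum_hazard = atoms.sum()                               # total cumulative hazard
lower_bound = np.exp(-c_of_eta(eta) * cum_hazard)      # exp(-c(eta) * Lambda)
```

The constant $c(\eta)$ blows up as $\eta \to 0$, which is why the proof needs the largest atom to stay bounded away from one uniformly in $n$.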



Lemma B.3 Let $(X_1,Y_1),\dots,(X_n,Y_n)$ denote i.i.d. random variables with $F(y|x) := P(Y_i \le y|X_i = x)$. Define $F_n(y|x) := \frac{\sum_i V_i(x) I\{Y_i \le y\}}{\sum_j V_j(x)}$, which is an estimator of the conditional distribution function $F(y|x)$, and assume that the weights $V_i(x)$ satisfy conditions (W1)(1)-(W1)(3), the bandwidth $h$ fulfills $nh^d \to \infty$, $h \to 0$ and that additionally the following conditions hold

1. $F(t|x)$ is continuous at $(t_0,x_0)$

2. $1 - F(t|y) \ge C(1 - F(t|x))$ for all $(t,y) \in (t_0 - \varepsilon,t_0] \times I$ where $I$ is a set with the property $\int_{I\cap U_\delta(x)} f_X(s)\,ds \ge c\,\delta^d$ for some $c > 0$ and all $0 < \delta \le \varepsilon$.

3. $F(t_0 - \delta|z)$ is continuous in the second component at the point $z = x$

4. The distribution function $G$ of the random variables $X_i$ has a continuous density $g$ with $g(x) > 0$.

Then, with the notation $U := \max\{Y_i : V_i(x) \neq 0\}$, we have for $n \to \infty$

$$\sup_{0\le y\le t_0\wedge U} \frac{1 - F(y-|x)}{1 - F_n(y-|x)} = O_P(1).$$

Proof: Define

$$\bar F_n(y|x) := \frac{\sum_{i=1}^n F(y|X_i)\, I\{\|x - X_i\| \le h\}}{\sum_{i=1}^n I\{\|x - X_i\| \le h\}}$$

and observe the representation

$$\frac{1 - F(y-|x)}{1 - F_n(y-|x)} = \frac{1 - F(y-|x)}{1 - \bar F_n(y-|x)}\cdot\frac{1 - \bar F_n(y-|x)}{1 - F_n(y-|x)}.$$

We now will derive bounds for both ratios on the right hand side. For the first factor we note for sufficiently small $h$ for all $t \in (t_0 - \delta,t_0]$

$$X_i \in I\cap U_h(x) \ \Rightarrow\ 1 - F(t-|X_i) \ge C(1 - F(t-|x)).$$

This implies

$$\sup_{t\in(t_0-\delta,t_0]} \frac{1 - F(t-|x)}{1 - \bar F_n(t-|x)} = \sup_{t\in(t_0-\delta,t_0]} \frac{(1 - F(t-|x))\sum_i I\{\|x - X_i\| \le h\}}{\sum_i I\{X_i \in I\cap U_h(x)\}(1 - F(t-|X_i))}\cdot\frac{\sum_i I\{X_i \in I\cap U_h(x)\}(1 - F(t-|X_i))}{\sum_i I\{\|x - X_i\| \le h\}(1 - F(t-|X_i))} \le \frac{1}{C}\,\frac{\sum_i I\{\|x - X_i\| \le h\}}{\sum_i I\{X_i \in I\cap U_h(x)\}}.$$

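A standard application of the Chebyshev inequality yields for an arbitrary set $M$

$$P\Big(\Big|\frac{1}{n\,P(X_1 \in M)}\sum_{i=1}^n \big(I\{X_i \in M\} - P(X_i \in M)\big)\Big| \ge \varepsilon\Big) \le \frac{1}{\varepsilon^2\, n\,P(X_i \in M)},$$

and a direct application of this result in combination with assumptions 2 and 4 shows that the ratio $\sum_i I\{\|x - X_i\| \le h\} / \sum_i I\{X_i \in I\cap U_h(x)\}$ is bounded in probability, which implies

$$\sup_{t\in(t_0-\delta,t_0]} \frac{1 - F(t-|x)}{1 - \bar F_n(t-|x)} = O_P(1).$$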

It now remains to consider the interval $[0,t_0 - \delta]$. Observe that condition 3 implies $1 - F(t_0 - \delta-|X_i) \ge 0.5\,(1 - F(t_0 - \delta-|x))$ if $\|X_i - x\|$ is sufficiently small, which yields

$$\frac{1 - F(t-|x)}{1 - \bar F_n(t-|x)} \le \frac{1 - F(t-|x)}{1 - \bar F_n(t_0 - \delta-|x)} \le 2\,\frac{1 - F(t-|x)}{1 - F(t_0 - \delta-|x)} < \infty$$

for sufficiently large $n$. Summarizing, we have obtained the estimate

$$\sup_{0\le y\le t_0} \frac{1 - F(y-|x)}{1 - \bar F_n(y-|x)} = O_P(1).$$

Thus it remains to consider the ratio $(1 - \bar F_n(y-|x))/(1 - F_n(y-|x))$. For this purpose note that

$$\begin{aligned}
\text{(B.4)}\qquad 1 - F_n(y-|x) &= \frac{\sum_i V_i(x)(1 - I\{Y_i < y\})}{\sum_j V_j(x)} = \frac{1 + o_P(1)}{C(x)}\sum_i V_i(x)(1 - I\{Y_i < y\})\\
&\ge \underline c\,\frac{1 + o_P(1)}{C(x)}\,\frac{1}{nh^d}\sum_i I\{V_i(x) \neq 0\}(1 - I\{Y_i < y\})\\
&\ge \underline c\,\frac{1 + o_P(1)}{C(x)}\,\frac{1}{nh^d}\sum_i I\{\|x - X_i\| \le h\}(1 - I\{Y_i < y\})\\
&= \frac{\underline c\, f_X(x)}{C(x)}\,(1 + o_P(1))\,\frac{\sum_i I\{\|x - X_i\| \le h\}(1 - I\{Y_i < y\})}{\sum_j I\{\|x - X_j\| \le h\}}
\end{aligned}$$

uniformly in $y$. In (B.4) the last equality follows from $\frac{1}{nh^d}\sum_j I\{\|x - X_j\| \le h\} = f_X(x)(1 + o_P(1))$, the second equality is a consequence of (W1)(3) and the two inequalities follow from (W1)(1) and (W1)(2) respectively. Note that the quantity $\sum_i I\{\|x - X_i\| \le h\}(1 - I\{Y_i < y\}) / \sum_j I\{\|x - X_j\| \le h\}$ equals $1 - F_n^{NW}(y-|x)$ where $F_n^{NW}$ is the Nadaraya-Watson estimator of $F$ with rectangular kernel. Thus it remains to find a bound for $(1 - \bar F_n(y-|x))/(1 - F_n^{NW}(y-|x))$. Conditionally on $X_1,\dots,X_n$, this is simply the ratio between $1 - \bar F$ and $1 - F_n$ where $F_n$ is the empirical distribution function of the sample $\{Y_i : \|x - X_i\| \le h\}$ with sample size $\sum_j I\{\|x - X_j\| \le h\}$ and $\bar F$ is the averaged distribution function of the corresponding $Y_i$. Since the random variables $Y_i$ are independent conditionally on $X_i$, we can apply the results from van Zuijlen (1978) to obtain the bound
P ( 1 - F^^(t - |x) < /3(1 - Fn{t - \x)) yt<U 



Xi, 



■,Xn 



< 



27r2 



(3' 



3 (i-^r 

Since the right hand side of the last inequality does not depend on any random quantities or their distributions, this result also holds unconditionally, and thus the proof is complete. □